INTRODUCTION TO 

ECONOMIC STATISTICS 




BY 

GEORGE R. DAVIES, Ph. D. 

PROFESSOR OF SOCIOLOGY, UNIVERSITY OF NORTH DAKOTA 



NEW YORK 
THE CENTURY CO. 
1993 


Copyright, 1922, by
The Century Co.


Printed in U. S.A. 



PREFACE 


The application of statistical methods to special 
fields such as demography, education, and economics 
has in recent years advanced very rapidly. As a re- 
sult a study of statistical principles must be confined 
chiefly to one of the special fields if it is not to be
lost in the multiplicity of specific methods and illus- 
trations. This text-book has been written with the in- 
terests of the student of economics in mind. 

A common difficulty which the teacher of statistics 
encounters is a lack of provision for laboratory work. 
An attempt has here been made to supply this need 
by furnishing illustrative problems, graphs, and data 
which may be worked over by the student, and by add- 
ing to each chapter a list of related exercises. The 
exercises will be found extensive enough so that the
teacher may select those which are adapted to his re- 
quirements. The longer problems should be subdivided, 
and the parts assigned to different members of the 
class. Both the tables and the exercises may very well 
be supplemented by the use of data drawn from such 
sources as the Survey of Current Business, the Monthly 
Labor Review, and the Statistical Abstract of the 
United States (Superintendent of Documents, Govern- 
ment Printing Office, Washington, D. C.). The topics 
covered in the text represent probably a maximum of 
what can be mastered by a college class in a term. 
Perhaps it may be found advisable to omit certain 
topics, such as interpolating for quartiles, theories
of price indexes, parabola trends, seasonal variations,
and the more complex methods of correlation. 

Nearly all the material here presented has been ac- 
cumulated from experience in the statistical laboratory 
and class-room. Particular attention has been given 
to the requirements in respect to fundamental theory 
of the statistical departments of the larger banks and 
business houses. Some of the recently developed 
methods of handling business barometers have there- 
fore been touched upon, and some attention has been 
given to the theory of price and production indexes. 

The book is an outgrowth of the undergraduate 
course in statistics given by the writer at Princeton 
University during the school year 1920-1921. This 
course was modeled in its general features upon the 
course given the preceding year by Professor J. H. 
Williams, now of Harvard University. The writer 
wishes to acknowledge his indebtedness to Professor 
Williams for the general plan of the laboratory exer- 
cises, as well as for many specific suggestions. Thanks 
are also extended to Professors F. A. Fetter and E. 
W. Kemmerer of Princeton for their interest and en- 
couragement, and to Professor W. F. Willcox of Cor- 
nell who read the first draft of the manuscript and 
made several valuable suggestions for its revision. In- 
debtedness is also acknowledged to the following for 
their kind permission to reprint data: Mr. Roger W.
Babson, Professor Stanley E. Howard, Bureau of La- 
bor Statistics, National City Bank, National Bureau of 
Economic Research, National Industrial Conference 
Board, Review of Economic Statistics, and the Quar- 
terly Journal of the University of North Dakota. 



CONTENTS 


CHAPTER                                                  PAGE

I    Tabulation                                             3
II   Types and Measures of Dispersion                      20
III  Indexes of Wages and Prices                           47
IV   Quantity Indexes and Their Uses                       74
V    Trends and Cycles                                    100
VI   Correlation                                          131

Appendix I
     Laboratory Material and References                   153
Appendix II
     Tables of Powers and Roots                           157
Appendix III
     A Picture of the Progress of the United States
     During 120 Years of National Life                    160

Index                                                     161





CHAPTER I 
TABULATION 

The term “statistics” when used to designate a 
branch of study, implies an exposition of certain
methods employed in presenting and interpreting the 
numerical aspects of a given subject. The science of 
statistics consists, therefore, of principles and methods, 
rather than of data. The principles are essentially 
the same whether the application is made to biology, 
demography, education, or economics. But the de- 
tailed methods in these and other fields have in late 
years become so specialized that it is hardly practi- 
cable any longer to study statistics in the abstract. 
The field of application here adopted is chiefly that of 
general economics.¹ Illustrations will be given of the
methods employed in organizing data, in computing
and employing indexes, and in measuring trends and
correlations. 

1 It should be noted that economic statistics are commonly distin- 
guished from business statistics. The former subject studies general 
market conditions, while the latter subject deals with the details of a
specific business establishment, and is therefore an adjunct of account- 
ing. Business statistics vary so greatly from one establishment to an- 
other that it is difficult to generalize from them. Their problems 
consist largely of the application of statistical principles to specific 
situations. For a discussion of the distinction see “The Scope of Busi- 
ness Statistics,” by R. P. Falkner, in the Quarterly Publications of 
the American Statistical Association, June, 1918, pp. 24-29.



Preliminary Schedules. Statistical field work com- 
monly begins with the preparation of schedules in 
which to enter the desired data. These schedules may 
be in the nature of questionnaires, or they may be mere 
forms in which to copy certain records. Since such 
schedules will vary with the particular task in hand, 
few rules can be laid down for their preparation. Experience
will teach, however, that they must be very
carefully worked out in advance. In the first place, it 
must be determined as precisely as possible what data 
will be needed. If the schedule is in questionnaire 
form, great care must be taken to make the questions 
unambiguous and easily comprehended by those who 
are to answer. Errors may often be checked by asking 
a question in two ways, at different places in the list;
as by calling for both the age and the date of birth. 
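
The check described above, asking for both the age and the date of birth, can be sketched as a small consistency test. This is only an illustration: the function names and the sample dates are assumptions, and age is taken here as completed years on the day the schedule is filled in.

```python
from datetime import date


def age_on(birth: date, survey: date) -> int:
    """Completed years of age on the survey date."""
    years = survey.year - birth.year
    # Take off a year if the birthday has not yet come round in the survey year.
    if (survey.month, survey.day) < (birth.month, birth.day):
        years -= 1
    return years


def answers_agree(stated_age: int, birth: date, survey: date) -> bool:
    """True when the stated age matches the age implied by the date of birth."""
    return stated_age == age_on(birth, survey)


# Hypothetical schedule entry: born 15 July 1890, schedule filled in 1 July 1921.
print(answers_agree(30, date(1890, 7, 15), date(1921, 7, 1)))
```

A schedule on which the two answers disagree would be set aside for verification rather than corrected by guesswork.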

After the preliminary schedules have been compiled, 
the statistician begins to organize his data, so that conclusions
may be deduced and presented in simplified
form. In so doing he will make use of the processes 
of tabulation. These may best be explained by taking 
an example. As such an example, we shall choose 
certain wage schedules which have been recorded in 
the Aldrich Report on “Wholesale Prices, Wages, and
Transportation” (Senate Report No. 1394, dated 
1893). The data here used will be found in Vol. IV, 
pages 1463-1497, and refer to a Connecticut woolen 
mill designated as Establishment No. 86. The wage 
rolls as recorded in the report cover about half a 
century, ending in 1891. They show the daily wages 
paid to each class of workmen employed in the mill 
in January and July of each year. We shall select
for tabulation only the wages paid in July of 1870,
1880, and 1891.

The Primary or General Purpose Table. In tran- 
scribing the selected wage schedules into a primary 
table, it will be found advisable to edit the original 
figures by making certain minor modifications. An in- 
spection of the schedules will show that most of the 
wage rates are expressed as multiples of five cents. 
Of course, we may assume theoretically that the exact
economic values which are approximated in the actual 
wages must form a continuous series, instead of one 
having a regular interval. That is, if the wages could 
be expressed as theoretically exact values, and if a 
very large number of workers were involved, the rates 
would be separated by intervals of only a small frac- 
tion of a cent. The case may be compared with the 
measurement of the height of the individuals in a 
large group. If the measurements are taken with 
very precise instruments, the results will be expressed 
in hundredths or perhaps thousandths of an inch. But 
for practical purposes measurements to perhaps the 
nearest quarter inch are sufficiently accurate. In the 
same way when the employer made an offer of wages he 
set his figure at a multiple of five cents, or in the case 
of the larger wages at a multiple of twenty-five cents. 
For his purpose, such an estimate of the market value 
of labor was sufficiently accurate. The exceptions will 
be found to be rates that were paid to an especially 
large number of workers. In such cases differences
of a cent or two are of considerable consequence. 

Continuous and Discrete Series. A series of measurements
or values occurring only at more or less regular
intervals is said to be discrete. Sometimes a series
will be naturally discrete, as when flowers are clas- 
sified by the number of petals, or when the spots in 
a number of dice-throws are tabulated. But a series 
which is theoretically continuous becomes artificially 
discrete when a limit of accuracy is determined upon, 
as when height is measured to the nearest quarter- 
inch, or wages are expressed in multiples of five cents. 
Our series of wage rates, as it stands in the records, 
is therefore discrete at intervals of five cents, except 
for a few items. 

For purposes of classification it is desirable to have 
our series of wage rates regularly discrete throughout. 
Since it is not possible to break down the five cent 
intervals to smaller ones, it will be necessary to modify 
such rates as 67c. and $1.28 so as to classify them as 
multiples of five. If there were only a few scattered 
cases of this sort affecting only a small number of
workers, they might merely be entered at the nearest 
five cent intervals; that is, 67c. could be entered as 
65c. and $1.28 as $1.30. But since the number of workers
at these irregular rates is exceptionally large,
such a procedure might result in too great a degree of 
inaccuracy. We shall therefore apply a familiar arith- 
metical device and break up the given number of work- 
ers at each inconvenient rate into two groups, one 
having a higher and one a lower rate than that stated, 
but maintaining together the same average wage. The 
procedure may be illustrated as follows:


    15 workers at 67c. =  9 workers at 65c.
                          6 workers at 70c.





This result is obtained by taking 3/5 and 2/5 of the
fifteen workers (the fractional parts of the five cent
interval which lie between 67c. and the nearest multiples
of 5) and placing the larger of the two results
at the rate expressed by the multiple of five nearest to
67c., that is, at 65c. The smaller of the two results is
placed at the rate of 70c. That the average wage has
not been changed by this operation is shown by the fact
that

    15 × 67c. = 9 × 65c. + 6 × 70c.

By the foregoing method all irregular wage rates
may now be reduced to approximately equivalent multiples
of five cents. Of course, when a fraction appears
in the operation the nearest whole number is taken.
Thus modified, our wage schedules appear as shown
in Table I, which may be taken as an example of a
primary or general purpose table.

TABLE I

WAGE ROLL IN A CONNECTICUT WOOLEN MILL
JULY OF SPECIFIED YEARS

OCCUPATION                        1870            1880            1891
                               NO.    WAGE     NO.    WAGE     NO.    WAGE
Burlers                          1*   $.80      11*   $.80      14*  $1.10
                                 2*    .85       8*    .85
Card cleaners                    1     .70       4    1.15       1    1.10
                                 2    1.10                       1    1.15
                                 1    1.25                       1    1.20
                                                                 2    1.25
Card tenders                     1     .55       2     .60       1     .75
                                 2     .60       2     .65       1     .85
                                 3     .65                       1     .90
                                 1     .80                       1    1.00
Carpenters                       1    2.75       1    2.75       1    2.75
Cloth inspectors                                 1    1.60       1    1.90
Drawers-in                       1*   1.25       1*   1.90       1*   1.75
                                                 1*   1.95       1*   1.85
                                                                 4*   1.90
Dressers                                         2    1.50       1    1.25
                                                 2    1.55       2    1.60
                                                                 1    1.65
                                                                 3    1.75
Dyers                            2    1.35       3    1.25       1    1.15
                           [illeg.]   1.50       5    1.30      13    1.25
                                                                 1    1.40
                                                                 2    1.50
Firemen                    [illeg.]   1.50                       3    1.50
Foremen-burlers                                                  1    2.25
Fullers and giggers              4    1.10       2    1.05       2    1.05
                                 3    1.15       4    1.20       4    1.10
                                 2    1.25       1    1.25       3    1.15
                                 1    1.35                       3    1.25
                                 2    1.50                       1    1.50
Handers-in                       2*    .40       2*    .40       5*    .50
Harness hands                                                    1    1.50
Loom fixers                      2    1.50       3    1.95       1    2.00
                           [illeg.]   1.10                       2    2.10
                           [illeg.]   1.15                       4    2.20
Machinists                 [illeg.]   2.75       1    3.00       1    2.75
Machinists, helpers        [illeg.]   1.75                       1    2.10
Number sewers                                                    3*    .90
Overseers, Carding dept.   [illeg.]   3.50       1    3.25       1    4.00
  Dyehouse                 [illeg.]   3.75       1    3.25       1    4.25
  Finishing dept.          [illeg.]   2.50       1    3.00       1    3.50
  Fulling dept.            [illeg.]   2.75       1    2.50       1    2.50
  Spinning dept.           [illeg.]   2.75       1    3.00       1    2.75
  Spooling dept.                 1    2.25       1    2.25       1    2.50
  Weaving dept.                  1    3.00       1    3.00       3    3.00
Piecers                          1     .75       2     .70
                                 1     .80       8     .75
Second hands                     1    1.50       1    1.25       2    1.75
                                 1    1.75       1    1.50
Sewers                                           1*   1.25       1*   1.00
                                                                 2*   1.25
Shearers                         1    1.15       6    1.25       8    1.35
                                 1    1.40
                                 1    1.50
Sorters                          2    2.00       2    1.80       2    1.70
                           [illeg.]   2.75       1    2.75       1    2.75
Speckers                         1    1.35       1*    .90       2*    .80
                                                 2*    .95       2*    .85
Spinners, jack and mule    [illeg.]   1.75       1    1.50       4    1.25
                           [illeg.]   1.80                       7    1.30
Spoolers                   [illeg.]    .60       5*    .65       9*    .75
                                                 4*    .70      14*    .80
Teamsters                  [illeg.]   1.50       1    1.50       1    1.60
Twisters                                         7*    .90       4*    .90
Watchmen                   [illeg.]   1.50       1    1.50       1    1.30
                                                                 1    1.35
                                                                 1    1.40
Weavers                          4*   1.05      79    1.20       2*   1.30
                           [illeg.]   1.10      10*   1.40       8*   1.35
                           [illeg.]   1.30       2*   1.45      89    1.70
                           [illeg.]   1.35                      60    1.75
Weavers, pattern                                                 3    1.25
                                                                 2    1.35
                                                                 4    1.50
                                                                 1    1.75
                                                                 1    2.00
Winders                    [illeg.]   1.00       6*    .95      12*   1.15
                                                 8*   1.00      17*   1.20
Yarn carriers              [illeg.]   1.25       1    1.75       1    1.85
Totals                         108             213             361

* Female employees.
[illeg.] Entry illegible in the copy.
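
The splitting device described before Table I can be sketched in a few lines. This is an illustration only: the function name is an assumption, wages are carried in whole cents, and fractions of a worker are rounded to the nearest whole number with halves rounded upward (the text does not say how an exact half should be treated).

```python
def split_irregular_rate(workers: int, rate_cents: int, step: int = 5):
    """Divide workers at an irregular rate between the two adjacent
    multiples of `step`, keeping the average wage as nearly unchanged
    as whole numbers of workers allow."""
    lower = (rate_cents // step) * step
    if rate_cents == lower:
        return [(workers, lower)]        # already a regular rate
    upper = lower + step
    # Share sent to the higher rate is (rate - lower) / step of the workers,
    # rounded to the nearest whole worker using integer arithmetic.
    n_upper = (2 * workers * (rate_cents - lower) + step) // (2 * step)
    n_lower = workers - n_upper
    return [(n_lower, lower), (n_upper, upper)]


groups = split_irregular_rate(15, 67)
print(groups)                                    # [(9, 65), (6, 70)]
print(sum(n * w for n, w in groups), 15 * 67)    # both 1005
```

When the fraction does not come out to a whole worker, the total wage bill is preserved only approximately, which is the sense of "approximately equivalent" in the text.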

The Frequency Curve. The tabulation which we are 
about to undertake has for its immediate object the 
presentation of the frequency distribution of the wages 
in question. Since the concept of a frequency distribu- 
tion has a concise theoretical basis, it will be of ad- 
vantage to turn briefly at this point to the theoretical 
aspects of the subject. 

If we take the square of a binomial, as $a^2 + 2ab + b^2$,
we have three classes of values as expressed by the
letters and their exponents, and these classes have
frequencies expressed by their coefficients, 1:2:1. If
instead of the second power we take the fourth power
of the binomial, we have five classes of values, having
frequencies respectively of 1:4:6:4:1. These frequencies
graphed as vertical blocks will form a figure such
as is outlined by the dotted line in Fig. 1. If instead
of the fourth power of the binomial we should take the
thousandth or millionth power, the steps in this blocked
frequency polygon would practically disappear, and
the figure would approach a smoothed bell-shaped
curve as indicated in the same figure. This theoretical
distribution of classes of values, or something similar
to it, may be discovered to exist very generally in
natural and social phenomena, and is also the expression
of what are known as the laws of chance. The
length of leaves on a given tree, the height of a group
of persons, the per cent net earnings of corporations,
or the deviations from normal of a price index through
a series of years, will show when properly classified
and graphed an approximation to the bell-shaped frequency
curve. In order to discover whether this curve
is inherent in a given set of data, it is necessary first
that the data contain a considerable number of items,
and second that the classification be suitably adjusted
to the range and numbers. If, for example, the height
of a hundred persons were taken merely to the nearest
foot, only two or three classes would appear. If, however,
the measurements were taken accurately to .01
inch, so many classes would appear that the frequencies
would be hopelessly scattered. But if we
made our measurements to the nearest inch, we would
obtain a series of frequencies somewhat like the following
(the classes ranging from 60 to 73 inches inclusive):
1:2:4:7:10:14:16:16:12:8:5:3:1:1. These frequencies
when graphed will give an approximation to
the bell-shaped curve. It will be necessary, therefore,
in tabulating our wage data to work out experimentally
the most suitable classification.

Figure 1. Normal frequency curve (solid line), and an approximation
to it (dotted line) based on the fourth power of a binomial.
Horizontal scale in units of standard deviation from average (0).
Quartile deviation (Q.D.), and average deviation (A.D.)
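
The binomial frequencies mentioned above are easily generated for any power, which makes it possible to watch the blocked figure smooth out as the power rises. The sketch below is an illustration only; the function names and the crude text histogram are assumptions and not part of the original exposition.

```python
from math import comb


def binomial_frequencies(power: int) -> list[int]:
    """Coefficients of (a + b) raised to `power`, e.g. 1:4:6:4:1 for the fourth."""
    return [comb(power, k) for k in range(power + 1)]


def text_histogram(freqs: list[int], width: int = 40) -> list[str]:
    """One row of '#' marks per class, scaled so the largest class fills `width`."""
    peak = max(freqs)
    return ["#" * round(width * f / peak) for f in freqs]


print(binomial_frequencies(2))   # [1, 2, 1]
print(binomial_frequencies(4))   # [1, 4, 6, 4, 1]
for row in text_histogram(binomial_frequencies(12)):
    print(row)
```

Raising the power in the last call gives more and narrower steps, and the outline approaches the bell shape described in the text.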

The Tally Sheet and Frequency Table. To facilitate 
the classification of the wages selected for study, a tally 
sheet is drawn up as shown in the first two columns of 
Table II. We shall show here the details for only the 
1891 figures, leaving the 1870 and 1880 figures to be 
worked out by the student. In studies where the items
must be entered singly, it is customary to use the familiar
“four and cross” method of tallying (four strokes crossed by a fifth = 5),
but this is inappropriate when the items are already 
partially grouped as they are here. In this case the 
number of workers as shown in the wage roll is entered 
in the appropriate line of the tally sheet, much as jour- 
nal items are posted to a ledger. Each entry is sepa- 
rated from adjacent ones by a dash. Each line is then 
totaled, and the result entered under the five cent 
column of “Frequency Classes.” A series of values 
thus arranged according to magnitude is known as an 
array. 
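
The posting of wage-roll entries to the tally sheet can be sketched as follows. The entries are the 1891 items of Table I that fall at $1.10, $1.15, and $1.20, taken in the order of the table; the function name and the use of whole cents are assumptions made for the illustration.

```python
from collections import defaultdict

# (number of workers, daily wage in cents), 1891 column of Table I
entries_1891 = [
    (14, 110),                         # burlers
    (1, 110), (1, 115), (1, 120),      # card cleaners
    (1, 115),                          # dyers
    (4, 110), (3, 115),                # fullers and giggers
    (12, 115), (17, 120),              # winders
]


def tally_sheet(entries):
    """Post each entry to the line for its wage; return {wage: (entries, total)}."""
    lines = defaultdict(list)
    for workers, wage in entries:
        lines[wage].append(workers)
    return {wage: (posted, sum(posted)) for wage, posted in sorted(lines.items())}


for wage, (posted, total) in tally_sheet(entries_1891).items():
    print(f"{wage / 100:.2f}  {'-'.join(map(str, posted)):<12} {total}")
```

Sorting the lines by wage is what turns the posted entries into an array in the sense just defined.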

TABLE II

WAGES IN A CONNECTICUT WOOLEN MILL, JULY, 1891

DAILY   TALLY               FREQUENCY CLASSES, INTERVALS OF:
WAGE    (NO. OF WORKERS)      5c.   15c.   25c.   50c.    CUM.
  .40                                0                     0
  .45                                       0      0       0
  .50   5                     5                            5
  .55                                5                     5
  .60                                                      5
  .65                                                      5
  .70                                0      5              5
  .75   1-9                  10                           15
  .80   2-14                 16                           31
  .85   1-2                   3     29                    34
  .90   1-3-4                 8                           42
  .95                                      37     42      42
 1.00   1-1                   2     10                    44
 1.05   2                     2                           46
 1.10   14-1-4               19                           65
 1.15   1-1-3-12             17     38                    82
 1.20   1-17                 18            58            100
 1.25   2-1-13-3-2-4-3       28                          128
 1.30   7-1-2                10     56                   138
 1.35   8-1-8-2              19                          157
 1.40   1-1                   2                          159
 1.45                               21     59    117     159
 1.50   2-3-1-1-4            11                          170
 1.55                                                    170
 1.60   2-1                   3     14                   173
 1.65   1                     1                          174
 1.70   2-89                 91           106            265
 1.75   1-3-2-60-1           67    159                   332
 1.80                                                    332
 1.85   1-1                   2                          334
 1.90   1-4                   5      7                   339
 1.95                                      74    180     339
 2.00   1-1                   2                          341
 2.05                                2                   341
 2.10   2-1                   3                          344
 2.15                                                    344
 2.20   4                     4      7      9            348
 2.25   1                     1                          349
 2.30                                                    349
 2.35                                1                   349
 2.40                                                    349
 2.45                                       1     10     349
 2.50   1-1                   2      2                   351
 2.55                                                    351
 2.60                                                    351
 2.65                                0                   351
 2.70                                       2            351
 2.75   1-1-1-1               4                          355
 2.80                                4                   355
 2.85                                                    355
 2.90                                                    355
 2.95                                0      4      6     355
 3.00   3                     3                          358
 3.05                                                    358
 3.10                                3                   358
 3.15                                                    358
 3.20                                       3            358
 3.25                                0                   358
 3.30                                                    358
 3.35                                                    358
 3.40                                0                   358
 3.45                                       0      3     358
 3.50   1                     1                          359
 3.55                                1                   359
 3.60                                                    359
 3.65                                                    359
 3.70                                0      1            359
 3.75                                                    359
 3.80                                                    359
 3.85                                0                   359
 3.90                                                    359
 3.95                                       0      1     359
 4.00   1                     1      1                   360
 4.05                                                    360
 4.10                                                    360
 4.15                                0                   360
 4.20                                       1            360
 4.25   1                     1      1      1      2     361
Totals                      361    361    361    361

An inspection of the five cent frequencies shows that
we have discovered only a very rough approximation
to the theoretical frequency curve. We therefore experiment
with larger groupings to see if we can thus
obtain more distinctive results. Classes at fifteen,
twenty-five, and fifty cent intervals are shown in the
designated columns. These classes are found by
adding the five cent frequencies falling within the
limits of each class; each class total is entered on the
line of the highest wage in the class. Obviously,
several variations of these groupings could be made
by beginning at different points in the scale; but the
arrangement here chosen, which gives regular classes
back to the zero point, is the most natural one to take.
Comparing the different groupings, we see that the
twenty-five and fifty cent intervals give results as
smooth as we are likely to get. Since a classification
with larger intervals promises to be too indefinite, it
is useless to carry our frequency classifications
further.
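
The regrouping of the five cent frequencies into wider classes can be sketched as below. The figures are the 1891 five cent frequencies of Table II, keyed by wage in cents; the function name is an assumption. Classes are counted from the zero point, as in the text, so each class is labelled here by its lower limit, and classes with no workers are simply omitted from the result.

```python
FIVE_CENT_1891 = {
    50: 5, 75: 10, 80: 16, 85: 3, 90: 8,
    100: 2, 105: 2, 110: 19, 115: 17, 120: 18,
    125: 28, 130: 10, 135: 19, 140: 2,
    150: 11, 160: 3, 165: 1, 170: 91, 175: 67, 185: 2, 190: 5,
    200: 2, 210: 3, 220: 4, 225: 1,
    250: 2, 275: 4, 300: 3, 350: 1, 400: 1, 425: 1,
}


def regroup(freqs: dict[int, int], width: int) -> dict[int, int]:
    """Add the five cent frequencies falling in each class of `width` cents,
    with class limits reckoned back to the zero point."""
    classes: dict[int, int] = {}
    for wage, count in freqs.items():
        lower = (wage // width) * width
        classes[lower] = classes.get(lower, 0) + count
    return dict(sorted(classes.items()))


for width in (15, 25, 50):
    print(width, regroup(FIVE_CENT_1891, width))
```

Beginning the classes at some other point in the scale would amount to shifting each wage by a constant before the division, which is one way to try the variations the text mentions.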

The Derived or Special Purpose Table. In order to
give a summarized presentation of the fifty cent frequencies,
Table III has been drawn up. In this table
the classes are designated by stating the upper and
lower limits, as $.50 to $.95, inclusive. If the class
interval is so arranged that the mid-point falls at a
round number, the class may be designated by this
number. Thus the twenty-five cent frequencies could
be tabulated according to the nearest quarter dollar.
The usual method is, however, to state the limits, as
here shown.

TABLE III

WAGES IN A CONNECTICUT WOOLEN MILL

NUMBER AND PERCENTAGE OF WORKERS DISTRIBUTED
ACCORDING TO DAILY WAGES, JULY OF SPECIFIED YEARS

WAGE PER DAY          1870            1880            1891
(DOLLARS)           NO.     %       NO.     %       NO.     %
 0   to  .45          2    1.8        2    0.9        0    0
 .50 to  .95         18   16.7       58   27.2       42   11.6
1.00 to 1.45         54   50.0      126   59.2      117   32.4
1.50 to 1.95         22   20.4       17    8.0      180   49.8
2.00 to 2.45          3    2.8        1    0.5       10    2.8
2.50 to 2.95          6    5.6        3    1.4        6    1.7
3.00 to 3.45          1    0.9        6    2.8        3    0.8
3.50 to 3.95          2    1.8        0    0          1    0.3
4.00 to 4.45          0    0          0    0          2    0.6
Total               108  100        213  100        361  100

Aldrich Report, pp. 1463 ff.

In contrast with Table I, which presented in orderly 
form practically all the detailed data of our study, this 
table is a derived or special purpose table. It aims to 
present only certain features of the wage schedules, 
and therefore purposely omits details. It is in the 
nature of a generalization, condensing the original 
facts into as brief a compass as is practicable. 

The preparation of such a table usually calls for 
both consideration and skill. The bracketing system 
used in the headings, or captions, is obvious — the wider 
blocks bracket and designate the smaller ones imme- 
diately beneath. Which set of subdivisions are entered 
as captions, and which are entered in the stub to the 
left, is usually determined by the exigencies of space. 
The arrangement of the details will depend upon the 
nature of the table. In census tables, for example, 
the current date is placed in the first column, to the 
left, because of its greater importance, thus reversing 
the chronological order. In such tables, also, totals 
will be given the place of prominence at the top, 
directly beneath the caption. In an elaborate table 
percentage columns will be placed together, or in a
separate table, to allow of easy comparison. Correla- 
tive items in the caption or stub should be arranged 
in some logical order, whether by magnitude as in the 
case of the frequency classes, chronologically as the 
successive wage distributions, geographically as in the 
case of a census list of states, by order of origin, or 
merely alphabetically. 





In computing the percentage columns, each fre- 
quency is divided by the total. The work will be suf- 
ficiently accurate if computed on a slide rule or string 
chart. The percentage totals will not necessarily come 
to exactly one hundred per cent, because of the inaccuracies
involved in cutting off decimals. If it is desired,
however, they may be brought to the proper
total merely by turning one or two items that stand
at or near five in the first decimal dropped. Thus
7.55 may be written 7.6 or 7.5, according to which is
needed to make up the total of one hundred; or 7.56
might even be written 7.5 if no better can be done. A
sufficient degree of accuracy can always be secured by
extending the number of decimals retained. But, in
general, the special purpose table should not exhibit
meticulous accuracy. Decimals should be shortened
or dropped, fractions avoided, and large numbers
rounded.
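
One mechanical way of bringing a percentage column to exactly one hundred is sketched below. It uses the largest-remainder rule, which is a substitute for, and not the same as, the hand adjustment described above: every item is first cut off at one decimal, and the tenths still lacking are given to the items whose dropped portions were largest. The function name is an assumption; the counts are the 1891 fifty cent frequencies.

```python
def percentages_summing_to_100(counts: list[int], decimals: int = 1) -> list[float]:
    """Percentages to `decimals` places, adjusted so that they total 100."""
    total = sum(counts)
    scale = 100 * 10 ** decimals                  # e.g. 1000 tenths of a per cent
    units = [(c * scale) // total for c in counts]          # cut off, not rounded
    dropped = [(c * scale) % total for c in counts]         # size of what was cut
    shortfall = scale - sum(units)
    # Raise by one unit the items with the largest dropped portions.
    by_dropped = sorted(range(len(counts)), key=lambda i: dropped[i], reverse=True)
    for i in by_dropped[:shortfall]:
        units[i] += 1
    return [u / 10 ** decimals for u in units]


counts_1891 = [0, 42, 117, 180, 10, 6, 3, 1, 2]
print(percentages_summing_to_100(counts_1891))
```

Since several items may stand near the turning point, the particular items this rule adjusts need not coincide with those chosen by hand in a printed table.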

The Frequency Polygon. The percentage columns 
of Table III may now be graphically presented by 
“frequency polygons,” as shown in Figure 2. The 
percentage columns are here used in preference to 
the absolute numbers because they reduce the three 
polygons to the same scale. It will be seen that in 
drawing the frequency polygons the points represent- 
ing the percentage frequencies are plotted directly 
above the mid-point of each class, respectively. These 
points are then joined, the individual years being dis- 
tinguished by different kinds of lines. The graph 
brings out very well the general advance in minimum, 
maximum, and mean wages that occurred in 1891 in 
the mill under consideration. The data for a single 
year may also be graphed as a “rectangular histogram,”
as illustrated in the next chapter (Fig. 3, p. 25;
solid line). Both the frequency polygon and the rectangular
histogram are frequency curves approximately
expressed.
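
The points of a frequency polygon can be computed before any drawing is done. In the sketch below the mid-point of a class is taken, as an assumption, to be halfway between its stated limits (so that the class $.50 to $.95 is plotted at $.725); if the classes are instead regarded as running up to the next lower limit, the mid-points shift by two and a half cents. The percentages are the legible 1870 and 1891 columns of Table III, and the function name is hypothetical.

```python
def polygon_points(class_lowers, percentages, width=50, step=5):
    """(mid-point in dollars, percentage) for each class, ready for plotting."""
    points = []
    for lower, pct in zip(class_lowers, percentages):
        upper = lower + width - step      # highest rate stated for the class
        points.append(((lower + upper) / 2 / 100, pct))
    return points


lowers = list(range(0, 450, 50))          # 0, 50, ..., 400 cents
pct_1870 = [1.8, 16.7, 50.0, 20.4, 2.8, 5.6, 0.9, 1.8, 0.0]
pct_1891 = [0.0, 11.6, 32.4, 49.8, 2.8, 1.7, 0.8, 0.3, 0.6]

for year, pcts in (("1870", pct_1870), ("1891", pct_1891)):
    print(year, polygon_points(lowers, pcts))
```

Joining the successive points for each year with a distinctive line gives the polygons; the same pairs serve for a histogram if each is drawn as a block of the class width.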

Difficult Features of Tabulation. Before leaving the 
subject of tabulation, a few general suggestions may 
be made. In the tabulation considered in this chapter, 
the problem of interpreting the term “wages” has 
been solved for us by the Aldrich Report. But if we 
had undertaken the task of filling in the original 
schedules by actual field work, we should have been 
faced with the difficulty of drawing a somewhat artificial
line distinguishing wages from salaries, and
perhaps from commissions and other direct or indirect 
income. From the standpoint of economic theory, of 
course, salaries are generically wages. But in practice
payments for relatively responsible and skilled work,
contracted usually on the basis of a considerable period 
of time, and carrying some degree of stability of 
tenure, are classed as salaries. They are excluded 
from wage schedules as being presumably determined 
less directly by supply and demand considerations. 
Likewise most statistical units, however precise and 
simple they may appear at first glance, usually present 
many difficulties when they are applied to real condi- 
tions. Precisely what, for example, should be included 
in a tabulation as a book, a farm, an accident, a ton-mile?
Almost any unit that may be chosen will be
found to call for careful discrimination, and an exam- 
ination of current usage. 

A further difficulty is encountered when data con- 
cerning given units are being gathered and compared 
over a certain period of time, or from different con- 
temporaneous environments. It often happens that 
the definition of the unit varies at different times or 
in different places; or perhaps the basis of estimating
the frequencies may be altered, or comparisons may 
be invalidated by changing conditions. In our study of 
the Connecticut mill we may evidently assume that 
the basis of the classification has not changed materi- 
ally through the period studied, since the occupations 
classed as wage-earning are specified. But we might 
modify our interpretation of the change in wage levels 
upon observing that processes of work had altered, 
that the work-day was shortening, that child labor 
legislation was affecting the personnel, that the per- 
centage of female workers shifted from 28% to 19%, 
and then to 32%, or that the cost of living had fallen. 
Thus it is always necessary in comparative studies to
consider carefully both the environmental conditions
and the statistical units employed. 

The statistician who is at all ingenious will discover 
many short cuts to lighten the work of tabulation. A 
method which is often useful is that of entering the 
original data on 3 × 5 or 4 × 6 cards. Suppose, for
example, that we wished to classify the students of a 
given college according to their entrance grades, the 
class of school from which entering, fraternity mem- 
bership, and scholastic standing during their college 
course. A numbered card could be prepared for each 
student, and the desired data entered. The cards could 
then be sorted as desired, the sub-totals could be de- 
termined, or the data listed. If, however, such work 
is to be done on an extended scale, a tabulating ma- 
chine will be required. Such a machine automatically 
sorts and counts special cards on which the required 
data have been recorded by a keyed punch. The larger 
business houses are using machine tabulators increas- 
ingly; and of course extensive compilations like a cen- 
sus are prepared principally by machines. 
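
The card method lends itself to a brief sketch, in which each record stands for one numbered card and sorting or counting is done on whichever entries are wanted. Everything here is hypothetical: the field names, the grade and standing labels, and the four sample cards are invented for the illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentCard:
    number: int
    entrance_grade: str
    school_class: str
    fraternity: bool
    standing: str


def sort_cards(cards, field):
    """Sort the cards into piles according to one entry."""
    piles = {}
    for card in cards:
        piles.setdefault(getattr(card, field), []).append(card)
    return piles


def sub_totals(cards, *fields):
    """Count the cards in every combination of the chosen entries."""
    return Counter(tuple(getattr(card, f) for f in fields) for card in cards)


cards = [
    StudentCard(1, "A", "public", True, "good"),
    StudentCard(2, "B", "private", False, "fair"),
    StudentCard(3, "A", "public", False, "good"),
    StudentCard(4, "C", "public", True, "poor"),
]

print(sub_totals(cards, "school_class"))
print(sub_totals(cards, "entrance_grade", "standing"))
print([c.number for c in sort_cards(cards, "fraternity")[True]])
```

Counting on two or more entries at once corresponds to sorting the cards successively on each, which is the operation a tabulating machine performs on punched cards.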

Library Work. The subject of tabulation has been 
extensively treated by writers on statistics. Day’s 
article, cited below, will prove to be very valuable to 
the student. It may be found reprinted in Secrist’s 
“Readings,” together with another excellent article 
on the same subject. Chapter IV of the same book is
especially pertinent to the subject of wage tabulations.
Chapter VII of Rugg’s text-book presents a concise
description of the frequency curve. Machine tabulation
is described in the circulars of the Tabulating
Machine Company, of New York.





REFERENCES 

Bailey and Cummings, Statistics, Chapters I-V.

Bowley, Arthur L., Elements of Statistics, Chapter IV.

Day, E. E., “Standardization of the Construction of Statistical
Tables,” Quarterly Publications of the American
Statistical Association, March, 1920, pp. 59-66.

Koren, John, A History of Statistics.

Rugg, H. O., Statistical Methods Applied to Education, Chapter
VII.

Secrist, Horace, An Introduction to Statistical Methods,
Chapters I-V.

Secrist, Horace, Readings and Problems in Statistical Methods,
Chapters I-V.

Yule, G. U., An Introduction to the Theory of Statistics,
Chapter VI.

EXERCISES 1 

1. Draw up tally sheets and frequency classifications for the 
years 1870 and 1880. 

2. Tabulate the twenty-five cent classes, showing both abso- 
lute and percentage figures, for the years 1870, 1880, and 
1891. Draw frequency polygons from the percentage 
data. 

3. Similarly tabulate and graph the fifteen cent classes. 

4. Graph the five cent classes for the years 1870, 1880, and 
1891. 

5. Draw rectangular histograms of the fifty cent frequencies 
for 1870 and 1880. Compare the relative advantages of 
the frequency polygon and the rectangular histogram. 

6. Classify and tabulate separately the female workers for 
the years 1870, 1880, and 1891. (Starred items — see foot- 
note, Table I.) 

7. Obtaining data from the Aldrich Report, study the wages 
paid in Establishment No. 86 in July of 1875 and 1885. 
Classify, tabulate, and graph as for the other years 
studied. 

8. Obtain the average scholarship grades for a selected group 
of students (100 or more), classify these grades, tabulate, 
and draw a frequency polygon.

9. Toss two coins twenty-five times, keeping a record of the 
number of heads thrown at each toss. Classify and tabu- 
late the results, and draw a frequency polygon. Toss 
four coins fifty times, making similar records. What prin-
ciple is illustrated?

1 Before beginning a notebook the student should read Appendix I.



CHAPTER II 

TYPES AND MEASURES OF DISPERSION 


After the frequency distribution of a given array 
has been presented in suitable form, there remains the 
task of finding simple numerical measures by which 
it may be summed up for purposes of ready descrip-
tion and comparison. The two features thus to be
expressed are the typical wage and the degree of dis- 
persion or “spread” about the type. As a wage type, 
and a base from which dispersion may be measured, 
the common average, or arithmetic mean, will immedi- 
ately suggest itself. In the case of the wage rolls given 
in the preceding chapter, the average may be most con- 
veniently found by multiplying the wages, as tabulated 
at five cent intervals, by their respective frequencies. 
The sum of these products, divided by the number of 
workers, is the average. It is the weighted average of 
the class values at five cent intervals, since these values 
are emphasized in accordance with their frequencies. 
We shall find that weighted averages are sometimes 
taken in which the weights are derived estimates of 
the importance which should be attached to the values, 
respectively; but in this case the weighting amounts
simply to a summing up of the original wages. The
formula for the weighted average is \(\frac{\Sigma FV}{N}\) (summa-
tion of the frequencies times the class values, divided
by the number of items). The accompanying table
(Table IV), derived from Table II, shows the process
of finding the average wage for the year 1891.

TABLE IV

WAGE ROLL AND AVERAGE WAGE
CONNECTICUT MILL, JULY, 1891

Single Wage    No. of Workers    Total Wage

   $ .50              5           $  2.50
     .75             10              7.50
     .80             16             12.80
     .85              3              2.55
     .90              8              7.20
    1.00              2              2.00
    1.05              2              2.10
    1.10             19             20.90
    1.15             17             19.55
    1.20             18             21.60
    1.25             28             35.00
    1.30             10             13.00
    1.35             19             25.65
    1.40              2              2.80
    1.50             11             16.50
    1.60              3              4.80
    1.65              1              1.65
    1.70             91            154.70
    1.75             67            117.25
    1.85              2              3.70
    1.90              5              9.50
    2.00              2              4.00
    2.10              3              6.30
    2.20              4              8.80
    2.25              1              2.25
    2.50              2              5.00
    2.75              4             11.00
    3.00              3              9.00
    3.50              1              3.50
    4.00              1              4.00
    4.25              1              4.25

   Total            361           $541.35
   Average                        1.49958, or $1.50
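The arithmetic of Table IV can be checked with a short script. The following is a minimal sketch in Python, not part of the original text; the names WAGE_ROLL_1891 and weighted_average are illustrative, and wages are held in whole cents (an assumption made only to avoid rounding error).

```python
# Wage roll for July 1891 as (wage in cents, number of workers), from Table IV.
WAGE_ROLL_1891 = [
    (50, 5), (75, 10), (80, 16), (85, 3), (90, 8), (100, 2), (105, 2),
    (110, 19), (115, 17), (120, 18), (125, 28), (130, 10), (135, 19),
    (140, 2), (150, 11), (160, 3), (165, 1), (170, 91), (175, 67),
    (185, 2), (190, 5), (200, 2), (210, 3), (220, 4), (225, 1),
    (250, 2), (275, 4), (300, 3), (350, 1), (400, 1), (425, 1),
]


def weighted_average(table):
    """Return (N, sum of F*V, sum of F*V divided by N) for (value, frequency) pairs."""
    n = sum(f for _, f in table)
    total = sum(v * f for v, f in table)
    return n, total, total / n


n, total_cents, average_cents = weighted_average(WAGE_ROLL_1891)
print(n, total_cents / 100, round(average_cents / 100, 5))  # 361 541.35 1.49958
```

The three printed figures correspond to the "Total" and "Average" entries at the foot of the table.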


The Mode. In addition to the arithmetic mean, there
are other types which the statistician uses in summarizing
an array and measuring dispersion. One of these
is the mode. The mode is applicable, however, only 
to frequency distributions which conform in their gen- 
eral outlines to the theoretical frequency curve. It is 
the value which lies at the point of greatest frequency, 
and in the normal curve is therefore identical with 
the average. In the graph of a frequency distribution 
it is easily recognizable as the value indicated on the 
horizontal scale at the point directly under the highest 
point of the curve. (See Figure 3, page 25.) 

The mode is particularly useful in connection with 
those frequency curves which, though conforming in 
general outlines to the theoretical, are extended more 
on the one side than the other. Such curves are said to 
be skewed. The wage data studied in the preceding 
chapter gives curves which are somewhat skewed to 
the right, so that a small secondary mode sometimes 
appears. But in their original five cent frequencies 
they do not give a smooth enough curve to allow of 
a very definite mode. When, however, such curves are 
strongly skewed, and are yet passably smooth, the
mode is preferable to the average as a type of the 
array. 1 Suppose, for example, that in a wage array a 
few very large salaries are included. In such a case 
the average may fall between the wages and the
salaries, at a point where the frequencies are small. 
The mode, on the other hand, states the wage or salary 
most frequently paid. It is not affected by the skewed 
extreme of the curve; that is, by the relatively small 
number of large salaries. 

1 Frequency curves that are strongly skewed to the right will some-
times appear normal if transferred to semi-logarithmic paper, the
logarithmic scale being used as the base line. When this is the case,
the distribution is normal on the basis of the geometric rather than
the arithmetic mean (see page 94).

Determining the Mode. In frequency distributions
that are irregular the mode may often be approxi-
mately determined by a study of the larger frequency
groupings. Let us take as an illustration the twenty-
five cent frequency classes derived from the 1891 wage
data. These are shown in the third column of Table V,
the first and second columns being the data from which
they are derived, as given in Table II. The latter part
of the series is omitted, however, since it cannot affect
the position of the mode. In the fourth and succeed-
ing columns, variations of the twenty-five cent fre-
quencies are formed by beginning the classes at differ-
ent points in the scale. In each case the mode should
lie somewhere within the class having the largest
frequency; that is, it should lie in the following classes:

$1.50 to $1.70
 1.55 to  1.75
 1.60 to  1.80
 1.65 to  1.85
 1.70 to  1.90

Since the only point common to these five classes is
$1.70, this sum may be regarded as the mode. It will
be seen, however, that this is the same value which
would be taken as the mode on the basis of the five cent
classes. The same result would in this case also be
obtained by using the fifty cent classes. Ordinarily,
this method is applied only to the largest classes which
it is practicable to use in the frequency classification.
It may give a quite different result from that which
is obtained from the smallest classes.
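The grouping method just described can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's procedure: the names FREQ_1891 and modal_class are hypothetical, wages are in cents, and each twenty-five cent class is taken to be five consecutive five cent values.

```python
from collections import Counter

# Five cent frequencies for 1891 (cents -> workers); a Counter returns 0 for absent wages.
FREQ_1891 = Counter({
    50: 5, 75: 10, 80: 16, 85: 3, 90: 8, 100: 2, 105: 2, 110: 19, 115: 17,
    120: 18, 125: 28, 130: 10, 135: 19, 140: 2, 150: 11, 160: 3, 165: 1,
    170: 91, 175: 67, 185: 2, 190: 5, 200: 2, 210: 3, 220: 4, 225: 1,
    250: 2, 275: 4, 300: 3, 350: 1, 400: 1, 425: 1,
})


def modal_class(freq, offset, width=25, step=5):
    """Return (frequency, lowest value, highest value) of the largest class
    when the classes of the given width begin at offset, offset + width, ..."""
    best = None
    for start in range(offset, max(freq) + 1, width):
        total = sum(freq[v] for v in range(start, start + width, step))
        if best is None or total > best[0]:
            best = (total, start, start + width - step)
    return best


# One grouping for each of the five possible starting points on the scale.
modal_classes = [modal_class(FREQ_1891, offset) for offset in range(0, 25, 5)]
# The mode is taken as the value common to all five modal classes.
common = set.intersection(
    *(set(range(lo, hi + 1, 5)) for _, lo, hi in modal_classes)
)
print(modal_classes)
print(common)  # {170}
```

The five modal classes found are the same as those listed in the text, and their only common value is 170 cents.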



TABLE V
THE MODE

WAGE ROLL, CONNECTICUT MILL, JULY, 1891

                 FREQUENCIES IN 25c CLASSES
                     VARIOUS GROUPINGS
   V      F                                          SUMMARY

  .40     0            5                                 5
  .45     0                   5                          5
  .50     5                          5                   5
  .55     0                                 5            5
  .60     0     5                                        5
  .65     0           10                                10
  .70     0                  26                         26
  .75    10                         29                  29
  .80    16                                37           37
  .85     3    37                                       37
  .90     8           29                                29
  .95     0                  15                         15
 1.00     2                         31                  31
 1.05     2                                40           40
 1.10    19    58                                       58
 1.15    17           84                                84
 1.20    18                  92                         92
 1.25    28                         92                  92
 1.30    10                                77           77
 1.35    19    59                                       59
 1.40     2           42                                42
 1.45     0                  32                         32
 1.50    11                         16                  16
 1.55     0                                15           15
 1.60     3   106M                                     106
 1.65     1          162M                              162
 1.70    91                 162M                       162M
 1.75    67                        161M                161
 1.80     0                               165M         165
 1.85     2    74                                       74
 1.90     5            9                                 9
 1.95     0                   9                          9
 2.00     2                         10                  10
 2.05     0                                 5            5
 2.10     3     9                                        9
 2.15     0            8                                 8
 2.20     4                   8                          8
 2.25     1                          5                   5
 2.30     0                                 5            5
 2.35     0     1                                        1
 2.40     0
 2.45     0
 etc.


As applied to the given wages, however, the fore- 
going method of locating the mode is objectionable. 
The $1.70 frequency is relatively so large that in each 
classification it determines the mode without allowing 
due weight to the large frequencies a little further 
down the scale. The latter might be considered a sec- 
ondary mode, but we shall here assume that the use of 
more extensive data would result in a single mode. 
We may therefore illustrate the use of a method which
is applicable to irregular frequencies, or to data pre-
sented in only a few large classes. It will, of course,
be understood that with such limited data no very de-
pendable result can be obtained.

Figure 3. Rectangular histogram of wages in a Connecticut mill,
1891, fifty cent classes (solid line), and smoothed frequencies (broken
line).

In using this method, we shall not only approxi- 
mately determine the mode, but draw the smoothed 
curve as well. The method is illustrated in Figure 3. 
The frequencies are first represented by a rectan-
gular histogram. In drawing the histogram the ver-
tical lines should theoretically be drawn at a point
midway between the two adjacent class limits; for
example, the line separating the first and second
classes falls at .475. 1 It is evident that the mode lies
in the class $1.50 to $1.95, but it is desirable to locate
it somewhat precisely within the class. This may be
done by dividing the class interval into two parts pro-
portional inversely to the adjacent frequencies. 2 In so
doing the class limits are considered $1.475 and $1.975,
as drawn. The division may readily be constructed
geometrically, or it may be computed as follows:

\[ \$1.475 + \frac{10}{117 + 10} \times \$.50 = \$1.51 \]

The formula for this operation is

\[ M = L_1 + \frac{F_n}{F_m + F_n} \times C \]

in which

M = the mode.

L_1 = the lower limit of the modal class.

F_m and F_n = the frequencies adjacent to the one contain-
ing the mode, in the order named.

C = the class interval.
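The interpolation for the mode may be verified numerically. The sketch below is in Python; the function name interpolated_mode and its parameter names are illustrative assumptions, while the figures are those of the fifty cent classes used in the text.

```python
def interpolated_mode(lower_limit, f_below, f_above, interval):
    """Divide the modal class inversely to the adjacent frequencies:
    M = L_1 + F_n / (F_m + F_n) * C, where F_m lies below and F_n above."""
    return lower_limit + f_above / (f_below + f_above) * interval


# Modal class $1.475 to $1.975; adjacent frequencies 117 (below) and 10 (above).
mode = interpolated_mode(1.475, 117, 10, 0.50)
print(round(mode, 2))  # 1.51
```

Because the class below the modal class is much the larger of the two neighbors, the mode is drawn toward the lower limit.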

Smoothing the Frequencies. After the mode has 
been determined, the histogram may be smoothed into 
a frequency curve. This curve is drawn to conform 
as closely as possible to the theoretical bell-shaped 
curve; and yet to maintain, frequency by frequency, 
the same area as the original rectangular figure. Thus 
in the drawing \(A_1 = A_2 + A_3\), \(B_1 = B_2\), and \(C_1 = C_2\).
The curve culminates approximately at the mode as
previously determined, but the height is merely esti-
mated with reference to the required area. As thus
drawn, the curve presents an estimate of the probable
distribution of the economic values expressed by wage
rolls of the type studied. The irregularity of the data,
however, makes it far from typical.

1 The class limits should be so placed that the items in the modal class
average close to the mid-point of the class.

2 It is sometimes preferable to take the average of the items in the
modal and adjacent classes, or to include more classes.

The Median. Another type often used to represent
a given array is the median. The median is the value of
the middle item in an array. The number of this item
is found by the formula \(\frac{N + 1}{2}\); 1 and its value may be
determined by reference to the summation column of
the frequency table. For example, in the 1891 wage
data the median item is number 181, and its value as
determined by means of the summation column is
$1.70. That is, the 181st item falls within the $1.70
class. In case the median item should prove to be
fractional, and should fall between two frequencies, the
median would not be precisely determined. Suppose,
for example, that the median item had been number
174½. In such a case the median would lie between
the limits $1.65 and $1.70.
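Locating an item in the summation column is a simple search, which may be sketched in Python as below. The helper values_at is hypothetical, not drawn from the text; it returns the class values of the whole items on either side of a given item number, so that a fractional number falling between two classes shows both limits.

```python
import math
from bisect import bisect_left
from itertools import accumulate

# Wage roll for July 1891 as (wage in cents, number of workers), from Table IV.
WAGE_ROLL_1891 = [
    (50, 5), (75, 10), (80, 16), (85, 3), (90, 8), (100, 2), (105, 2),
    (110, 19), (115, 17), (120, 18), (125, 28), (130, 10), (135, 19),
    (140, 2), (150, 11), (160, 3), (165, 1), (170, 91), (175, 67),
    (185, 2), (190, 5), (200, 2), (210, 3), (220, 4), (225, 1),
    (250, 2), (275, 4), (300, 3), (350, 1), (400, 1), (425, 1),
]


def values_at(table, item_number):
    """Return the class values holding the whole items just below and just
    above item_number (the two agree when the item lies inside one class)."""
    values = [v for v, _ in table]
    summation = list(accumulate(f for _, f in table))  # the summation column
    below = values[bisect_left(summation, math.floor(item_number))]
    above = values[bisect_left(summation, math.ceil(item_number))]
    return below, above


n = sum(f for _, f in WAGE_ROLL_1891)
median_item = (n + 1) / 2
print(median_item, values_at(WAGE_ROLL_1891, median_item))  # 181.0 (170, 170)
print(values_at(WAGE_ROLL_1891, 174.5))                     # (165, 170)
```

The second call reproduces the supposed case of item 174½, which lies between the $1.65 and $1.70 classes.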

1 The unit is added to counterbalance the space from 0 to 1. Or, the
formula may be considered as an expression of the average of the
extreme ordinals of the array.

Comparison of the Three Types. Theoretically, in a
frequency curve having very small class intervals the
median value is indicated at the foot of a perpendic-
ular line which bisects the area of the curve. The aver-
age, on the other hand, would lie at the foot of a similar
perpendicular which would balance as an axis the
weight of the two sides, supposing the area of the curve
to have been cut out of a material of uniform weight.
When skewness is regular, the mode, median, and
average are located on the value scale in the order
named, the intervals separating them being in the ratio
of about two to one. In the normal curve the three
types are identical. 1

1 Two other forms of the average are sometimes used in statistical
work. One is the geometric mean. This may be found by averag-
ing the logarithms of the numbers instead of the numbers themselves,
and taking the antilogarithm of the result.
It is the nth root of the product of the numbers. Some statisticians
advocate the use of the geometric mean in finding the average periodic
change in prices. Considered merely as an average of prices apart from
the use of weights, the geometric mean is logically correct because it
measures ratios of divergence rather than absolute amounts. The other
type of average is the harmonic mean, which has occasionally been
applied to the same purpose. For two numbers, a and b, it is computed
by the formula \(\frac{2ab}{a + b}\). In arithmetic this is the formula which is used to
find an average rate of travel when two rates for two equal distances are
given. In general, it may be described as the reciprocal of the average
of the reciprocals of the given numbers.

SALARIES PAID IN REPRESENTATIVE UNIVERSITIES AND
COLLEGES IN THE UNITED STATES IN 1919-20.

Public institutions

TITLE OF POSITION          NUMBER OF     MINIMUM      MAXIMUM
                            PERSONS       SALARY       SALARY

President or chancellor         77        $2,500      $12,500
Dean or director               367         1,200       10,000
Professor                    2,460           300       10,000
Associate professor            822           300        4,000
Assistant professor          1,705           500    [illegible]
Instructor                   2,138           300    [illegible]
Assistant                      855            75    [illegible]

TITLE OF POSITION           AVERAGE       MEDIAN    MOST FREQUENT
                             SALARY       SALARY       SALARY

President or chancellor      $6,647       $6,000       $6,000
Dean or director              3,819        3,500        3,000
Professor                     3,126        3,000        3,000
Associate professor           2,514        2,500        3,000
Assistant professor           2,053        2,000        1,800
Instructor                    1,552        1,500        1,500
Assistant                       801          750        1,200

The use of the average, median, and mode as types
may be illustrated by the foregoing table, which was
issued by the United States Bureau of Education and
reprinted in the Monthly Labor Review of January,
1921. In this table the term “most frequent salary”
signifies the mode.
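The geometric and harmonic means mentioned in the footnote above may be illustrated by a brief Python sketch. The function names and the sample figures (rates of 30 and 60 over equal distances, and the numbers 4 and 9) are assumptions chosen for illustration and do not come from the text.

```python
import math


def geometric_mean(numbers):
    """Antilogarithm of the average logarithm (the nth root of the product)."""
    return math.exp(sum(math.log(x) for x in numbers) / len(numbers))


def harmonic_mean(numbers):
    """Reciprocal of the average of the reciprocals."""
    return len(numbers) / sum(1 / x for x in numbers)


# Two equal distances covered at rates of 30 and 60: the average rate is 40.
a, b = 30, 60
print(round(harmonic_mean([a, b]), 6), 2 * a * b / (a + b))
# The geometric mean of 4 and 9 is the square root of 36.
print(round(geometric_mean([4, 9]), 6))
```

For two numbers the general harmonic mean reduces to the formula 2ab/(a + b) given in the footnote, as the first printed line shows.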

Quartile Deviation. The types we have now consid-
ered are used as the basis for measuring dispersion,
though the average is more commonly employed than
the other two. The simplest measure of dispersion is
related to the median, and is called the quartile devia-
tion. This is found by computing the value of the first
and third quartiles of an array. The quartiles are
analogous to and include the median, being located at
the quarter divisions of the array. The location of the
first quartile is found by the formula \(\frac{N + 1}{4}\), and of
the third quartile by the formula \(\frac{3(N + 1)}{4}\). The values
of these items are determined by reference to the fre-
quency table in the same manner as the median value
was determined. The second quartile is, of course,
identical with the median. The quartile range is found
by subtracting the value of the first quartile from the
value of the third, and the quartile deviation is half
of this difference. It may be seen by reference to
Figure 1 that the quartile deviation as thus found is
simply the average distance between the median and
the adjacent quartiles, as measured on the base line.
In the 1891 wage roll the first quartile is item No. 90½,
and its value is $1.20. The third quartile is No. 271½,
and its value is $1.75. The quartile range is therefore
$.55, and the quartile deviation is $.28. This means
that half the workers receive wages that fall within a
range averaging twenty-eight cents above and below
the median; that is, between $1.20 and $1.75. 1
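The location of the quartiles and the resulting quartile deviation may be checked as follows. This Python sketch is illustrative only; the helper value_of_item is hypothetical, wages are in cents, and the search assumes (as is true here) that each quartile item falls inside a single class.

```python
import math
from bisect import bisect_left
from itertools import accumulate

# Wage roll for July 1891 as (wage in cents, number of workers), from Table IV.
WAGE_ROLL_1891 = [
    (50, 5), (75, 10), (80, 16), (85, 3), (90, 8), (100, 2), (105, 2),
    (110, 19), (115, 17), (120, 18), (125, 28), (130, 10), (135, 19),
    (140, 2), (150, 11), (160, 3), (165, 1), (170, 91), (175, 67),
    (185, 2), (190, 5), (200, 2), (210, 3), (220, 4), (225, 1),
    (250, 2), (275, 4), (300, 3), (350, 1), (400, 1), (425, 1),
]


def value_of_item(table, item_number):
    """Class value in which the given item number falls, read from the
    summation column; the neighboring whole items must share one class."""
    values = [v for v, _ in table]
    summation = list(accumulate(f for _, f in table))
    below = values[bisect_left(summation, math.floor(item_number))]
    above = values[bisect_left(summation, math.ceil(item_number))]
    if below != above:
        raise ValueError("item falls between two classes")
    return below


n = sum(f for _, f in WAGE_ROLL_1891)
first_item, third_item = (n + 1) / 4, 3 * (n + 1) / 4
q1 = value_of_item(WAGE_ROLL_1891, first_item)
q3 = value_of_item(WAGE_ROLL_1891, third_item)
quartile_range = q3 - q1
quartile_deviation = quartile_range / 2
print(first_item, third_item)        # 90.5 271.5
print(q1, q3, quartile_range, quartile_deviation)  # 120 175 55 27.5
```

The deviation of 27.5 cents is the figure stated in the text as $.28.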

For purposes of comparison the quartile deviation 
should usually be reduced to a percentage basis. This 
is done by dividing it by a value regarded as typical of 
the array. Since the quartile deviation is related to 
the median, it would appear logical to take this type 
as a base. But it is customary to take instead a point 
lying midway between the first and third quartile 
values ; that is, the average of the two. In a perfectly 
regular curve this value would naturally be identical 
with the median. The reason for taking this base is 
obviously that it is the point from which the quartile 
deviation is assumed to be directly measured. The
formula for it is \(\frac{Q_3 + Q_1}{2}\) (third quartile plus first
quartile, divided by two). The formula for the quartile
deviation is \(\frac{Q_3 - Q_1}{2}\). The latter divided by the
former is \(\frac{Q_3 - Q_1}{Q_3 + Q_1}\), which is therefore the formula for
the coefficient of quartile deviation.
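A worked instance of the coefficient may be helpful. The Python sketch below applies the three formulas to the 1891 quartile values of $1.20 and $1.75 found earlier; the function name quartile_coefficient is an illustrative assumption.

```python
def quartile_coefficient(q1, q3):
    """Return the base (Q3 + Q1)/2, the quartile deviation (Q3 - Q1)/2,
    and their ratio (Q3 - Q1)/(Q3 + Q1)."""
    base = (q3 + q1) / 2
    deviation = (q3 - q1) / 2
    return base, deviation, deviation / base


base, deviation, coefficient = quartile_coefficient(1.20, 1.75)
print(round(base, 3), round(deviation, 3), round(100 * coefficient))  # 1.475 0.275 19
```

The coefficient of about 19 per cent is a pure ratio, so that wage rolls of different general levels may be compared by it.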

1 The quartile deviation is also called the “probable error” of a
frequency distribution. The term is derived from the “Theory of
Errors,” and connotes the central range of the distribution within
which an item added to the series will have an even chance of fall-
ing.

Interpolation. Before leaving the quartiles, a method
of locating them by interpolation between the class
intervals should be described. We will illustrate the
method by applying it to the 1891 wage data, though
in fact the five cent classes give quartile values precise
enough for most purposes. The type of problem to
which the method is best adapted is one in which an
array as given is classified only in a few large groups.
But supposing that it is desirable to know the quartile
values very precisely in the 1891 wage data, we may
find them as described below.

In assuming that we may interpolate at any point 
between the items of the original frequency classifica- 
tion, we are evidently regarding the series as con- 
tinuous rather than discrete. In the case of the wage 
roll we shall be dealing, then, with the theoretical 
economic values underlying the wages as paid. The 
actual frequencies are therefore to be considered as 
indicating proportionate numbers, which may be in- 
creased indefinitely as in the case of any multiple ratio. 
The original discrete five cent classes are now to be 
considered as having continuous class intervals; for 
example, a fifty cent wage is taken to indicate an 
economic value within the limits $.475 and $.525. The
frequencies are assumed to be equally spaced between 
these limits. 

When about to interpolate, we locate the first
quartile by the formula \(\frac{N}{4}\), instead of \(\frac{N + 1}{4}\) as in the
previous case. The second and third quartiles are
two and three times this number, respectively. The
reason for omitting the unit in the formula, and for
ignoring it also in an analogous formula for sub-divid-
ing the class, is that our hypothesis of a continuous
scale renders the unit of negligible value. It is as if
the items were regarded as being groups of thousands
or millions; that is, as if they were indefinitely sub-
divisible. Having located the quartile items, we find
the class in which they fall by inspection of the fre-
quency table, as before. We next find the position of
the quartile within the class; that is, the fraction of
the interval that it is advanced beyond the lower limit.
The corresponding value is then determined. The
process is the same as that used in interpolating in
logarithmic or other tables.

In the 1891 wage data, the first quartile is located at
item 90¼. This item falls in the class having theo-
retically a lower limit of $1.175 and an upper limit of
$1.225. The preceding class ends with the 82nd item,
and the quartile is therefore advanced 8¼ items in its
own class of 18 items. This advance is 8¼ ÷ 18, or .46
of the class interval. This fraction of the class interval
of $.05 is $.023, which, added to the lower limit of the
class, gives the quartile value of $1.198. Read to the
nearest cent, this value happens to be the same as that
obtained without interpolation.

The process may be summed up as follows:

\[ Q = L_1 + \frac{I}{F} \cdot C \]

in which,

Q = Quartile value

L_1 = Lower limit of class containing quartile

I = Quartile item minus last item of preceding class 1

F = Frequency of class containing quartile

C = Interval of same class

1 “Item” here refers to the number, not the value.
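The interpolation formula above may be applied mechanically to all three quartiles. The following Python sketch is a hedged illustration: the function interpolated_quantile is hypothetical, wages are in cents, and every listed wage is assumed to stand for a continuous class extending 2.5 cents on either side of it.

```python
# Wage roll for July 1891 as (wage in cents, number of workers), from Table IV.
WAGE_ROLL_1891 = [
    (50, 5), (75, 10), (80, 16), (85, 3), (90, 8), (100, 2), (105, 2),
    (110, 19), (115, 17), (120, 18), (125, 28), (130, 10), (135, 19),
    (140, 2), (150, 11), (160, 3), (165, 1), (170, 91), (175, 67),
    (185, 2), (190, 5), (200, 2), (210, 3), (220, 4), (225, 1),
    (250, 2), (275, 4), (300, 3), (350, 1), (400, 1), (425, 1),
]


def interpolated_quantile(table, fraction, interval=5):
    """Q = L_1 + (I / F) * C on a continuous scale; the item is located
    at N * fraction, with no unit added."""
    n = sum(f for _, f in table)
    item = n * fraction
    preceding = 0  # last item of the preceding class
    for value, freq in table:
        if preceding + freq >= item:
            lower_limit = value - interval / 2
            return lower_limit + (item - preceding) / freq * interval
        preceding += freq
    raise ValueError("fraction must lie between 0 and 1")


q1, q2, q3 = (interpolated_quantile(WAGE_ROLL_1891, k / 4) for k in (1, 2, 3))
print(round(q1 / 100, 3), round(q2 / 100, 2), round(q3 / 100, 2))  # 1.198 1.68 1.73
```

The first figure is the $1.198 worked out in the text; the other two follow from the same rule applied at items 180½ and 270¾.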


Quartile Dispersion. A statement of the quartiles
and the highest and lowest wage paid serves to give a
fairly good idea of a frequency distribution even with-
out any further computation of precise measures of
dispersion. In Table VI such a statement is presented
for the wage data of 1870, 1880, and 1891, together
with the quartile deviations and their coefficients. The
table includes the interpolated values, although, as has
been intimated, their computation is hardly worth
while here except as an illustration of the method. In
Figure 4 the discrete quartiles and limits are shown
graphed upon semi-logarithmic paper. The vertical
scale of this paper is similar to the scale of a slide-
rule, hence the ratio of periodic change may be com-
pared by means of the slant of the lines connecting
the values. To be complete, however, such graphic
representation should show at least annual data. It
might also very well show decile points; that is, the
wages occurring at the tenths instead of the quarters
of each array.

RANGE AND TREND OF WAGES, 1870-1891, IN A CONNECTICUT
WOOLEN MILL

Figure 4. Semi-logarithmic, or ratio paper

TABLE VI

RANGE OF DAILY WAGES IN A CONNECTICUT MILL: SCALE
TAKEN BOTH AS DISCRETE AND CONTINUOUS

                          1870                      1880                      1891
WAGE ARRAY        DISCRETE    CONTINUOUS    DISCRETE    CONTINUOUS    DISCRETE    CONTINUOUS

Lower Limit      [illegible]     $ .38        $ .40        $ .38        $ .50        $ .48
1st Quartile     [illegible]      1.09          .95          .93         1.20         1.20
2nd Quartile     [illegible]  [illegible]   [illegible]     1.19         1.70         1.68
3rd Quartile         1.50         1.51         1.25         1.24         1.75         1.73
Higher Limit         3.75         3.78         3.25         3.28         4.25         4.28
Q. Deviation          .20          .21          .15          .16          .28          .27
  coefficient         15%          16%          14%          14%          19%          18%

The Ogive. A convenient method of presenting an
array and at the same time of graphically determining
the quartiles, is shown in Figure 5. The construction
is based upon the assumption of a continuous series of
values, and parallels the procedure of finding the quar-
tile values by interpolation. The frequencies are
plotted from the summation column of the original
five cent classes. For convenience of interpretation, a
dot marks the entry as it would be made on the graph
if the series were taken as discrete. A slanting line is
drawn across each class interval, beginning with the
summation total of the preceding class, and ending
with the summation total of the given class. The given
frequency is thus represented as distributed evenly
through the class. The resulting figure is known as
an ogive. To find the quartiles, the vertical scale rep-
resenting the whole array is divided into four equal
parts, and horizontal lines are drawn from the quartile
division points until the ogive is intersected. From
the points of intersection perpendiculars are drawn
to the base line. The foot of each perpendicular marks
upon the horizontal scale one of the quartile values.
The deciles may be found by dividing the vertical
scale into tenths, and proceeding as before. This
graphic process, worked out on large sheets of cross-
section paper, is usually the most convenient method
of finding the quartiles or deciles. 1
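The graphic reading of an ogive has a direct numerical counterpart, sketched below in Python. The names ogive_vertices and read_off are hypothetical; the sketch assumes five cent classes reaching 2.5 cents either side of each listed wage, and it reads values off the vertical (summation) scale at the quarter and tenth divisions.

```python
# Wage roll for July 1891 as (wage in cents, number of workers), from Table IV.
WAGE_ROLL_1891 = [
    (50, 5), (75, 10), (80, 16), (85, 3), (90, 8), (100, 2), (105, 2),
    (110, 19), (115, 17), (120, 18), (125, 28), (130, 10), (135, 19),
    (140, 2), (150, 11), (160, 3), (165, 1), (170, 91), (175, 67),
    (185, 2), (190, 5), (200, 2), (210, 3), (220, 4), (225, 1),
    (250, 2), (275, 4), (300, 3), (350, 1), (400, 1), (425, 1),
]


def ogive_vertices(table, interval=5):
    """Corner points (class limit, summation total) of the ogive; each class
    contributes one slanting line from its lower to its upper limit."""
    vertices, summation = [], 0
    for value, freq in table:
        start = (value - interval / 2, summation)
        if not vertices or vertices[-1] != start:
            vertices.append(start)
        summation += freq
        vertices.append((value + interval / 2, summation))
    return vertices


def read_off(vertices, level):
    """Foot of the perpendicular dropped where a horizontal line drawn at
    `level` on the vertical scale meets the ogive."""
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        if y0 < level <= y1:
            return x0 + (level - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("level lies outside the array")


vertices = ogive_vertices(WAGE_ROLL_1891)
n = vertices[-1][1]
quartiles = [read_off(vertices, n * k / 4) for k in (1, 2, 3)]
deciles = [read_off(vertices, n * k / 10) for k in range(1, 10)]
print([round(q / 100, 2) for q in quartiles])  # [1.2, 1.68, 1.73]
print([round(d / 100, 2) for d in deciles])
```

The quartiles agree with the interpolated values, as they must, since the slanting lines of the ogive express the same assumption of even distribution within each class.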

1 A more complex form of the ogive has recently been introduced
for testing the regularity of a frequency distribution. This ogive is
drawn upon so-called probability paper, and is constructed from the sum-
mated percentage frequencies. The vertical scale of the probability
paper is so graduated that a normal curve will form a straight line
diagonally across the paper. The divergence of a given distribution
from normal may be estimated by its departure from a straight line.
The paper for this graph, as well as for other statistical work, may be
obtained from the Codex Book Company of New York, or from other
publishers of statistical material.

Average and Standard Deviation. We shall now
consider the more commonly used mathematical
measures of dispersion: the average deviation and the
standard deviation. The former is coming to be fairly
well known as applied to economic data. The latter is,
however, generally favored by the mathematician, but
its chief statistical use at present lies in connection
with the measurement of correlation, a subject which
will be taken up later.

The principles involved in average and standard
deviation may best be illustrated by taking a very
simple example. Suppose that four workers are em-
ployed at daily wages of $2.00, $6.00, $7.00 and $9.00,
respectively. The average wage is $6.00. The first
wage differs from the average by $4.00, the second is
at the average, the third differs by $1.00, and the
fourth differs by $3.00. The sum of these differences
is $8.00, or an average of $2.00 for each wage. The
average deviation (A. D.) is therefore $2.00, which may
be taken as a measure of the “spread” of the wages. 1
The standard deviation (\(\sigma\)) is computed by squaring
the deviations, averaging the squares, and finding the
square root of this result. In each case a coefficient
may be found by dividing by the average wage. The
computations are written out in the following form:

     AVERAGE DEVIATION              STANDARD DEVIATION

       V          D                  V        D        D²

      $2         $4                 $2      $-4        16
       6          0                  6        0         0
       7          1                  7        1         1
       9          3                  9        3         9
     ----       ----               ----     ----      ----
   4 ) 24     4 ) 8              4 ) 24       0     4 ) 26

    A = 6    A. D. = 2            A = 6            σ² = 6.5
                                                    σ = 2.55

   Coef. = 2/6 = 33%             Coef. = 2.55/6 = 43%

1 The total spread, or range, from the highest to the lowest wage is
sometimes given as an inexpert measure of dispersion. But it is of
little value because the wage limits are set by single items which
have only a haphazard relation to the rest of the array. The average
and standard deviations, however, take account of all the items.
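The four-wage illustration can be reproduced in a few lines of Python. The sketch below is offered only as a check on the arithmetic; the variable names are illustrative.

```python
import math

wages = [2, 6, 7, 9]  # daily wages in dollars, as in the example
n = len(wages)
average = sum(wages) / n

# Deviations from the average, with signs kept for the standard deviation.
deviations = [w - average for w in wages]
average_deviation = sum(abs(d) for d in deviations) / n
mean_square = sum(d * d for d in deviations) / n
standard_deviation = math.sqrt(mean_square)

print(average, average_deviation, mean_square)  # 6.0 2.0 6.5
print(round(standard_deviation, 2))             # 2.55
print(round(100 * average_deviation / average),
      round(100 * standard_deviation / average))
```

The signed deviations sum to zero, which is a convenient test that the average has been correctly taken.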


A practical application of average deviation may be 
cited from Dewing, “Corporation Finance,” Vol. III.
The writer states that the earnings of corporations pro- 
ducing inexpensive necessities, directly consumed, are 
most regular; while the earnings of corporations pro- 
ducing expensive indirect goods are least regular. He 
illustrates the two types of corporations by the 
Diamond Match Company and the American Locomo- 
tive Company, computing the average deviations of 
the net earnings of the two companies. The coefficient 
in the first case is 7.1%, and in the second case 50%. 
The dispersion might have been measured in other 
ways, as by the standard deviation, but the quartile 
measure would not be applicable to such a small 
number of unclassified deviations. 

The quartile, average, and standard deviations do 
not give the same results, as they measure progres- 
sively larger portions of the frequency curve (see Fig. 
1, p. 10). But in regular distributions, a comparison 
will be the same whichever measure is used to make 
the comparison. In an irregular distribution which 
has a few extreme items, the standard deviation will 
give an exceptionally large result, since the process of 
squaring the deviations emphasizes these extremes. 




A Short-cut Method. The work of finding the aver- 
age or standard deviation is often rather tedious, par- 
ticularly when the average is expressed as a decimal. 
In finding the former, however, it is not important that 
the average should be expressed very precisely, since 
the slight error involved in cutting short a decimal is 
minimized in the process of the work. And in finding 
the standard deviation, a short-cut process may be 
used. In this process a convenient average is assumed, 
and a correction is made later. The method of making 
the correction may be illustrated by the simple wage 
scale of four items previously used. The average 
wage, from which the deviations are to be measured, 
will be assumed to be $7. Needless to say, this assump- 
tion is not here advantageous, though it would have 
been if the average had been, let us say, $6.75. 
When the deviations are measured from the assumed 
average of seven, they give an algebraic sum of — 4, 
which results from the fact that the “correction” 
appears once in each deviation. Hence the algebraic 
sum of the deviations, divided by the number of items, 
will give the “correction,” — the term being taken to 
mean the sum which must be added to the assumed 
average to make it the exact average. The correction 
(K) when found is added to the assumed average, and
its square is subtracted from the average squared
deviation. By so doing, whatever error may have been
involved in assuming an average is eliminated. Other-
wise, the work is as before. The computation is set
down as indicated below.

        V          D          D²

        2         -5          25
        6         -1           1
  A_x = 7          0           0
        9          2           4
                 ----        ----
              4 ) -4      4 ) 30

              K = -1          7.5
            A_x =  7     K² = 1
                         ----
              A =  6     σ² = 6.5
                          σ = 2.55

   Coef. = 2.55/6 = 43%

Deviation Computed from Frequency Tables. The
illustration that has been considered thus far in the
discussion of average and standard deviation has been
simplified by the fact that the array is not classified
into frequency groups. When computed from a fre-
quency table, the average and standard deviations
require a somewhat more complex process, though in
fact the principles involved are precisely those already
explained. The one point to be observed is that the
class values must be multiplied by their respective
frequencies in order that all items may be taken into
account. The method is shown in Table VII, where the
wage data of 1891 are again taken up. 1 In these com-
putations, the class values are taken at approximately
the mid-points of the class intervals. This is the usual
procedure, though a small error is introduced by so
doing. An exact computation would require that the
actual average value of each class, as determined by
dividing the total wages of the class by its frequencies,
should be substituted.

1 A complex graph of a frequency distribution, known as the Lorenz 
curve, may be described in connection with the data of Table VII. 
This graph is based upon the F and FV columns, aa shown under 



40 INTRODUCTION TO ECONOMIC STATISTICS 



TABLE VII 




AVERAGE DEVIATION AND STANDARD 

DEVIATION 


WAGE ROLL IN A CONNECTICUT MILL, 

, JULY, 1891 



AVERAGE DEVIATION 


V 

F 

FV 

D 

FD 

$ .75 

42 

$31.50 

$ .78 

$32.76 

1.25 

117 

146.25 

.28 

32.76 

1.75 

180 

315.00 

.22 

39.60 

2.25 

10 

22.50 

.72 

7.20 

2.75 

6 

16.50 

1.22 

7.32 

3.25 

3 

9.75 

1.72 

5.16 

3.75 

1 

3.75 

2.22 

2.22 

4.25 

2 

8.50 

2.72 

5.44 


361 

) 553.75 


361 ) 132.46 



A = 1.53 


A.D. = .367 





.367 





Coef . = == 24% 

Average Deviation. The two columns are reduced to percentages and
summated, giving the following results:

Upper limit
of class         F (Σ)       FV (Σ)
 $1.00           11.6%        5.7%
  1.50           44.0        32.1
  2.00           93.9        89.0
  2.50           96.7        93.1
  3.00           98.3        96.1
  3.50           99.2        97.8
  4.00           99.4        98.5
  4.50          100.0       100.0


The two summated columns are then plotted as coordinates, the first on 
the horizontal scale and the second on the vertical scale. If the wages 
were all alike, a direct diagonal would result, while disparity of wages
registers in the concavity of the line. The use of the five cent classes 
would give a more accurate representation. The curve has been often 
used for presenting a comparison of the distribution of wealth or income 
in different countries. 
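The summation described in this note may be sketched in Python. The figures are the F and FV columns of Table VII; the names are illustrative only. Because the percentages are here taken from exact running totals, a value may differ in the last digit from the printed column, which appears to have been rounded at an earlier stage.

```python
from itertools import accumulate

# Class mid-points (V), frequencies (F), and upper class limits from Table VII.
values = [0.75, 1.25, 1.75, 2.25, 2.75, 3.25, 3.75, 4.25]
freqs = [42, 117, 180, 10, 6, 3, 1, 2]
upper_limits = [1.00, 1.50, 2.00, 2.50, 3.00, 3.50, 4.00, 4.50]

fv = [f * v for f, v in zip(freqs, values)]   # wages paid in each class

# Running totals expressed as percentages of the grand totals.
cum_f = [100 * c / sum(freqs) for c in accumulate(freqs)]
cum_fv = [100 * c / sum(fv) for c in accumulate(fv)]

# Each pair (cum_f, cum_fv) is one point of the Lorenz curve:
# per cent of workers on the horizontal scale, per cent of wages on the vertical.
for limit, x, y in zip(upper_limits, cum_f, cum_fv):
    print(f"{limit:5.2f}  {x:6.1f}  {y:6.1f}")
```

If all wages were alike, every pair would show equal percentages, and the plotted points would fall on the diagonal.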






TYPES AND MEASURES OF DISPERSION 41 


STANDARD DEVIATION (Assumed Average = $1.75) 1

    V         F          D           FD          FD²
 $ .75       42      $ -1.00     $ -42.00     $ 42.00
  1.25      117        -.50       -58.50        29.25
  1.75      180         0           0            0
  2.25       10         .50         5.00         2.50
  2.75        6        1.00         6.00         6.00
  3.25        3        1.50         4.50         6.75
  3.75        1        2.00         2.00         4.00
  4.25        2        2.50         5.00        12.50

            361          361 )   -78.00   361 ) 103.00
                          K  =    -.216           .2853
                          Ax =    1.75     K² =   .0467
                          A  =    1.534    σ² =   .2386
                                           σ  =   .49

Coef. = .49 / 1.53 = 32%

Formulas. The formulas for average and standard
deviation are as follows:

\[ \text{A. D.} = \frac{\Sigma FD}{N} \quad \text{(the deviations here considered positive)} \]

\[ \text{S. D. } (\sigma) = \sqrt{\frac{\Sigma FD^{2}}{N} - K^{2}} \]

in which

F = Frequencies

D = Deviations (from assumed average if followed
by a correction)

N = Total number of items in array.

K = Correction for error in assumed average = \( \Sigma FD / N \)

If an average of zero is assumed, the second formula
becomes:

\[ \text{S. D.} = \sqrt{\frac{\Sigma FV^{2}}{N} - A^{2}} \]


1 A column showing D² is often included, but in most cases it may be
omitted and FD² obtained by multiplying D × FD.





In some cases, particularly where a calculating ma- 
chine is used, this modification of the formula will be 
found desirable. It calls for the computation of only 
the columns FV and FV². Its use may be illustrated
by reference to Table IV, p. 21. If the first and third
columns are multiplied across and totaled, a result of
$890.50 will be obtained. This is ΣFV². Dividing by
N and subtracting the square of the average
gives $.218, the square root of which is $.47. This is
the standard deviation, obtained somewhat more accu- 
rately than before, since smaller class intervals are 
taken. The modified formula will often be found 
useful in connection with time series, where no fre- 
quencies are involved. 
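As a check on the two forms of the standard deviation formula, the following Python sketch applies both to the fifty cent classes of Table VII. The names are illustrative; the assumed average of $1.75 is that used in the table, and the second computation assumes an average of zero, so that only the FV and FV² columns are needed.

```python
from math import sqrt

# Class mid-points (V) and frequencies (F) from Table VII.
values = [0.75, 1.25, 1.75, 2.25, 2.75, 3.25, 3.75, 4.25]
freqs = [42, 117, 180, 10, 6, 3, 1, 2]
n = sum(freqs)

# Method 1: deviations from an assumed average, followed by a correction K.
assumed = 1.75
sum_fd = sum(f * (v - assumed) for f, v in zip(freqs, values))
sum_fd2 = sum(f * (v - assumed) ** 2 for f, v in zip(freqs, values))
k = sum_fd / n                       # correction for error in assumed average
average = assumed + k
sd_assumed = sqrt(sum_fd2 / n - k ** 2)

# Method 2: an average of zero is assumed, so the deviations are the values themselves.
sum_fv = sum(f * v for f, v in zip(freqs, values))
sum_fv2 = sum(f * v ** 2 for f, v in zip(freqs, values))
sd_zero = sqrt(sum_fv2 / n - (sum_fv / n) ** 2)

print("K =", round(k, 3), " A =", round(average, 3))
print("S.D. (assumed average) =", round(sd_assumed, 2))
print("S.D. (zero average)    =", round(sd_zero, 2))
```

The two methods agree because the correction K, squared and subtracted, removes exactly the error introduced by measuring from the wrong origin.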

Summary. In Table VIII a final summary is made 
of the dispersion of wages in the Connecticut mill here 
studied. It is, of course, obvious that the use of a
variety of measures is for purposes of illustration 
only. In practical work of this sort only one measure 
would be used, probably either the quartile or the 
average deviation. Skewness would doubtless be com- 
pared merely by an inspection of the frequency poly- 
gons. Our more exhaustive study, however, gives a 
very precise picture of the dispersion of wages. On
the whole, the “spread” lessens somewhat after 1870, 
though the change is not great enough to render the 
data incomparable. It will be noted that the relative 
skewness of the curves changes but little. Since, then, 
the dispersion of wages does not materially change, the 
average wage may be safely taken as an index of the 
periodic wage level in the given mill. 





TABLE VIII

DISPERSION OF WAGES, CONNECTICUT MILL, JULY, 1870, 1880,
AND 1891

                                    ABSOLUTE                        RELATIVE
MEASURE                    1870        1880     1891       1870        1880    1891

Quartile Deviation      [illegible]   $ .16    $ .27        16%         14%     18%
Average Deviation*      [illegible]     .28      .37        31          22      24
   "       "  (exact)       .41         .24      .35    [illegible]     20      23
Standard Deviation*         .62     [illegible]  .49        44          40      32
   "       "  (exact)       .61         .46      .47        44          38      31
Skewness*                   .71         .64      .55       114         128     113

* Computed from fifty cent classes; V = mid-point of class.


Measurement of Skewness. The average deviation, 
as we have seen, uses the first power of the deviations, 
while the standard deviation uses the second power. 
If in an analogous way the third power is used, a 
measure of skewness is obtained. A mathematical 
measure of this sort is, however, seldom required in 
economic statistics. The relative degree of skewness 
may be roughly determined by comparing the outlines 
of frequency curves, or by noting the position of the 
modes relative to the averages or medians. But if an 
accurate measure is desired, the following formula 
should be employed : 


\[ \text{Skewness} = \sqrt[3]{\frac{\Sigma FD^{3}}{N}} \]

The measure may be reduced to a coefficient by dividing
by the standard deviation.

Library Work. The subjects of types and dispersion
are treated in great detail in several of the
standard text-books, such as Bowley’s and Yule’s. The


most complete work on the former subject is that of 
Zizek, cited below. For an exposition of graphic repre- 
sentation, the student should not fail to consult Brin- 
ton’s work, and the Statistical Atlas of the United 
States published in connection with the census. The 
ratio chart (semi-logarithmic, or “arith-log,” paper) 
is well treated in an article by Irving Fisher, as well 
as in an article by J. A. Field reprinted in Secrist’s 
“Readings.” Whipple’s text-book gives an explana- 
tion of the use of probability paper as a means of 
presenting and testing frequency curves. King’s text- 
book, page 156, explains the Lorenz curve. 

REFERENCES 

Bowley, Arthur L., Elements of Statistics, Chapters V-VII.
Brinton, W. C., Graphic Methods for Presenting Facts. 

Fisher, Irving, “The Ratio Chart,” Quarterly Publications 
of the American Statistical Association, June, 1917, pp. 
577-601. 

King, W. I., Elements of Statistical Method, Chapters XII- 
XIV. 

Marshall, Wm. C., Graphical Methods, Chapters I-III. 

Secrist, Horace, Readings and Problems in Statistical Methods,
pp. 282-305.

Whipple, G. C., Vital Statistics, Chapter XII. 

Yule, G. U., An Introduction to the Theory of Statistics, 
Chapters VII and VIII. 

Zizek, Franz, Statistical Averages. 

EXERCISES 

1. Using the five cent frequencies and classes, find the aver- 
age wage for 1870 and 1880. 

2. Using 25c class intervals, determine the mode for 1870 
and 1880, following the process illustrated on page 24. 

3. Using 50c class intervals, similarly determine the mode 
for 1870 and 1880. 

4. Using the 50c frequencies, determine by a mathematical
formula the position of the mode in the 1870 and 1880
data. Draw rectangular histograms and smooth them.

5. Explain why different values for the mode are obtained
in the two preceding exercises. Which results are the
more valid? Why?

6. Find the quartile items and their values in the 1870 and
1880 wage data by interpolating in the 5c classes. Compute
the quartile deviations and coefficients.

7. Draw ogives of the wage data for 1870 and 1880 (5c
frequencies), showing the quartile values.

8. Summate the percentage frequencies from Table III, page
13, and plot on probability paper.

9. Find the average deviation and coefficient for the 1870
and 1880 wage data, 50c classes.

10. Find the standard deviation and coefficient for the 1870
and 1880 wage data, 50c classes, using the method involving
an assumed average and correction.

11. Using data prepared in Exercise 9, draw Lorenz curves
of wage distributions in 1870 and 1880. Draw a similar
curve from the data of Table IV, page 21.

12. Apply the modified formula for standard deviation (page
41) to the five cent wage data for 1870 and 1880.

13. During a certain period the rate of bank discount was
as follows:

Rate per cent     No. of days
    2½               174
    3                408
    3½               132
    4                165
    4½                36
    5                 37
    5½                20
    6                 26
    7                  2

(a) Compute the average rate of discount, taking the
number of days as the frequencies.

(b) In what classes (rate per cent) do the quartiles
fall?

(c) What rate per cent may be taken as the mode?
Why?

(No interpolation is required in the above problem.)

14. Find the coefficients of average deviation, standard deviation,
and skewness for the following frequency distribution.
Locate the quartiles by interpolation and check the
results by means of an ogive.

V F 

$1 1 

2 3 

3 2 

4 2 

5 1 

6 1 

15. The following table shows the increase in the cost of 
living for ten cities from December, 1914, to December, 
1920. Classify these percentages to the nearest multiple
of five, and find the average deviation. (Bureau of 
Labor Statistics’ data.) 


Boston 97.4 

Buffalo 101.7 

Chicago 93.3 

Cleveland 104. 

Detroit 118.6 

Los Angeles 96.7 

New York 101.4 

Philadelphia 100.7 

San Francisco 85.1 

Seattle 94.1 


16. Apply the arithmetic formula for determining the mode 
to the following three time series. The months may be
treated as if they were class intervals, and the per- 
centages as if they were frequencies. Graph each series 
and construct a smoothed curve : 

Percentage of crops harvested monthly in the United 
States, as reported by Department of Agriculture.


Month Wheat Corn Cotton 

May 0.5 — — 

June 22.0 0.1 — 

July 42.3 0.1 1.4 

Aug 28.4 1.5 11.5 

Sept 6.5 15.8 31.6 

Oct 0.3 28.3 34.4 

Nov — 43.3 16.0 

Dec — 10.9 4.7 

Jan.-April — — 0.4 



CHAPTER III 


INDEXES OF WAGES AND PRICES 

The Nature of Indexes. A large part of statistical 
work concerns itself with the making and interpreting 
of indexes. By an index is meant a number, whether 
absolute or relative, which is used in comparisons to 
measure a given condition. Used collectively, the term 
implies a series of such indexes, forming a multiple 
ratio. Practically all indexes are compiled by a 
process of sampling. Thus, though it is impossible to 
record any large proportion of actual wages and prices, 
yet it is possible to estimate changes in the wage or 
price level by the use of well-selected samples. Just 
what may be regarded as sufficiently complete data in 
sampling cannot be determined precisely, but must be 
judged largely on the basis of experience. Straws 
show which way the wind blows, and likewise the price 
of a product in a single locality will often accurately 
reflect the trend of a world market. There is no 
dependable uniformity, however. Some prices respond 
quickly and universally to changes in supply or de- 
mand, while others move slowly and irregularly. With 
respect to wages, it is commonly observed that the 
market is somewhat slow in its movements. In an 
industrial center like New England, however, the 
market should be fairly responsive. One might nevertheless
hesitate to take the wages at a single mill as
an index of wages for the whole country; but comparisons
will show that such an index would, in fact,
have some degree of reliability.

Indexes of Wages. The wage averages considered 
in the preceding chapter will be taken provisionally as 
indexes of the wage level in the United States. Accord- 
ing to these indexes, daily wages in 1870 stood at $1.38, 
they fell by 1880 to $1.21, but climbed by 1891 to $1.50. 
The changes may be presented more clearly, however, 
if the figures are reduced to another form. Since in- 
dexes are used as ratios, they may be multiplied or 
divided through by any factor to suit given require- 
ments. If in this case they are divided by $1.38, the 
wage in 1870, they are said to be reduced to a base of 
1870, since the index for that date will become 100. 1 
Expressed literally, the result is 100%, but the per 
cent sign is usually dropped as being unnecessary in 
a ratio. The index numbers now read: 


Year        Wage

1870        100
1880         88
1891        109
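The reduction of a series to a given base is a single division, and may be sketched in Python as follows. The wage figures are those quoted in the text; the function name `reduce_to_base` is merely illustrative. The second call uses the average of the three wages as the base, which is an alternative choice rather than the one adopted in the table above.

```python
def reduce_to_base(series, base_value):
    """Express every item of the series as a percentage of base_value."""
    return {year: round(100 * value / base_value) for year, value in series.items()}

# Average daily wages in the Connecticut mill, as quoted in the text.
wages = {1870: 1.38, 1880: 1.21, 1891: 1.50}

# Base of 1870: the index for that year becomes 100.
index_base_1870 = reduce_to_base(wages, wages[1870])

# Alternative base: the average of the three wage figures.
average_wage = sum(wages.values()) / len(wages)
index_base_average = reduce_to_base(wages, average_wage)

print(index_base_1870)
print(round(average_wage, 2), index_base_average)
```

Since an index is a ratio, the choice of base alters every figure by the same factor and leaves the relations among the years unchanged.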


Indexes of Real Wages. In a study of changes in 
the wage level, a further factor of great importance 
must be taken into consideration. This factor is the 
cost of living. Changes in the cost of living affect the 
prosperity of wage earners inversely. Hence “real 
wages” — a term denoting the purchasing power of 
wages — will be measured by money wages divided by 

1 The base which is theoretically best to use in deriving an index from 
absolute numbers, is an average of the numbers. Its advantages are, 
first, that it is a stable value from which to measure the items, and 
second, that each index is made to suggest its relative position in the 
series. Applied to the given wage data, the base becomes the average 
wage of $1.36, and the indexes become 101, 89, and 110, each expressing a
percentage of the average. 





prices. Whether absolute or relative numbers are 
taken to measure wages and prices, the quotients may 
be regarded as comprising an index of real wages, and 
may be reduced to any desired base. If wages and 
prices are expressed as indexes having the same base, 
then the resulting index of real wages will also have 
this base. 

Various Wage Indexes. The accuracy of our provi- 
sional wage index may be tested and the study ex- 
tended, by the introduction of wage data of a more 
general character. These data will be taken from three 
sources: (1) “The Movement of Wages in the Cotton 
Manufacturing Industry of New England,” by Pro- 
fessor Stanley E. Howard; (2) The Aldrich Report, 
and (3) the publications of the Bureau of Labor Sta- 
tistics, of the Department of Labor at Washington. 
The first of these sources gives a carefully prepared 
index of weekly wages in the Massachusetts cotton 
manufacturing industry from 1860 to 1914. It is de- 
rived in part from the Aldrich Report, and uses the 
principles of tabulation and measurement already 
explained. The second gives a general index of wages 
down to 1891, based on wages from many industries, 
and covering various sections of the country. The 
third source furnishes an index of hour rates in the
United States from 1840 to 1920. Hour rates, of 
course, are not entirely satisfactory as a basis for an 
index of actual earnings because of the gradual reduc- 
tion in the length of the working day. This reduction 
has, however, been offset by increased over-time pay 
at higher rates, by a gain in leisure hours, and prob- 
ably by more regular employment. And as a matter 
of fact, the index of hour rates will be found to conform
somewhat closely to the index of weekly wages.
As measuring the cyclical changes in wages, both in- 
dexes give practically the same results. 

In Table IX the wage indexes from the sources just 
mentioned are shown for the years 1870, 1880, and 
1891, together with the indexes already derived. By 
means of price indexes — the nature of which will be 
considered later — nominal wages are reduced to real 
wages. The indexes of both nominal and real wages 


TABLE IX

INDEXES OF NOMINAL AND REAL WAGES, JULY, 1870, 1880,
AND 1891

                                    PRIMARY INDEXES           DERIVED INDEXES
SOURCE OF DATA                    1870     1880     1891     1870    1880    1891

Connecticut Mill
  Nominal wages (average)        1.375    1.207    1.50       100      88     109
  Prices                         1.47     1.09      .82
  Real wages                      .935    1.11     1.83       100     118     196

Mass. Cotton Mills
  Nominal wages (base, 1860)      166      154      172       100      93     104
  Prices                          140      105       79
  Real wages                      118      147      217       100     125     184

U. S.-Aldrich Report
  Nominal wage (base, 1866)      167.1    143.0    168.6      100      86     101
  Prices (base, 1860)            144.4    104.9     94.4
  Real wages                      116      136      179       100     118     154

U. S. Bureau of Lab. Stat.
  Nominal wage (base, 1913)        67       60       69       100      90     103
  Prices (base, 1913)             147      109       82
  Real wages                       46       55       84       100     121     185


are next reduced to a base of 1870, in which form they 
may be readily compared. It will be seen that the 
indexes of real wages in the Connecticut mill, based 
on very slender data though they are, do not differ 
markedly from the others. 
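The derivation of the real wage figures for the Connecticut mill may be followed in a short Python sketch. The nominal wages and prices are the primary indexes of Table IX; the variable names are illustrative. Real wages are taken as money wages divided by prices, and the quotients are then reduced to a base of 1870.

```python
# Primary indexes for the Connecticut mill, from Table IX.
nominal = {1870: 1.375, 1880: 1.207, 1891: 1.50}
prices = {1870: 1.47, 1880: 1.09, 1891: 0.82}

# Real wages: money wages divided by prices.
real = {year: nominal[year] / prices[year] for year in nominal}

# Derived index of real wages, reduced to a base of 1870.
real_index = {year: round(100 * real[year] / real[1870]) for year in real}

for year in real:
    print(year, round(real[year], 3), real_index[year])
```

The same two steps, applied to the other three sources with their own bases, should reproduce the remaining derived indexes to within the rounding of the printed figures.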





A more complete statement of the results of Pro- 
fessor Howard’s study, and of the Bureau of Labor 
Statistics’ index, is presented in Table X. As in the 


TABLE X

INDEXES OF WAGES AND PRICES, 1870-1920

                  MASSACHUSETTS                      UNITED STATES
         Wages     Wholesale    Real        Wages       Wholesale    Real
Year    (Weekly)    Prices      Wages     (Hr. Rates)    Prices      Wages

1870      166        140         60           67          147         46
1871      177        130         58           68          136         50
1872      183        135         58           69          141         49
1873      178        134         56           69          140         49
1874      163        128         54           67          133         60
1875      160        120         53           67          125         54
1876      145        111         55           64          116         55
1877      142        108         55           61          113         64
1878      145         98         63           60          102         59
1879      144         94         65           59           98         60
1880      154        105         62           60          109         55
1881      149        102         62           62          106         68
1882      157        103         64           63          108         68
1883      158         98         69           64          102         63
1884      155         89         74           64           92         70
1885      150         82         77           64           86         74
1886      153         80         80           64           84         76
1887      160         81         83           67           84         80
1888      164         84         83           67           87         77
1889      169         81         89           68           84         81
1890      173         80         91           69           81         85
1891      172         79         92           69           82         64
1892      172         75         97           69           76         91
1893      180         75        102           69           77         90
1894      168         68        104           67           69         97
1895      165         66        105           68           70         97
1896      175         64        116           69           66        106
1897      174         64        116           69           67        108
1898      171         66        110           69           69        100
1899      164         72         97           70           74         95
1900      189         78        102           73           80         91
1901      190         77        104           74           79         94
1902      191         80        101           77           85         91
1903      197         81        103           80           85         94
1904      196         80        103           80           86         98
1905      200         82        108           82           85         96
1906      216         87        105           85           88         97
1907      240         92        111           89           94         96
1908      228         87        111           89           91         98
1909      211         90         99           90           97         98
1910      209         93         95           93           99         94
1911      207         92         96           95           95        100
1912      223         95         99           97          101         96
1913      227         96        100          100          100        100
1914      229         95        102          102          100        102
1915                                         103          101
1916                                         111          124
1917                                         128          176
1918                                         162          196
1919                                         184          212
1920                                                      248
        (Base       (Base                  (Spring)     (Summer)
         1860)       1860)






preceding table, an index of real wages is derived by 
the use of an index of wholesale prices. The price in- 
dex shown for Massachusetts is merely an adaptation 
of the Bureau of Labor Statistics’ data for the United 
States. The index of real wages for Massachusetts 
has been changed from a base of 1860, as first derived, 
to a base of 1913, in order to allow of comparisons 
with the corresponding index for the United States. 
Both indexes point to the fact that real wages have 
about doubled in the interval from 1870 to 1914, but 
that the greater part of this increase came before 1890. 

Wholesale and Retail Prices. A question may be 
raised regarding the validity of using an index of 
wholesale prices as a measure of changes in the cost 
of living. Unfortunately, no adequate index of retail 
prices covering the years here studied is available. An 
index of wholesale prices has therefore been substi- 
tuted. It is a well established fact, however, that 
wholesale prices swing in the same direction and at 
nearly the same time as retail prices, but that they 
move somewhat more extremely. In the course of the 
usual moderate cyclic changes, therefore, the former 
will parallel the latter closely enough to serve as a 
substitute. But when the price swings are extremely 
low, the substitution will doubtless give an exagger- 
ated rise in real wages. Such is evidently the case in 
the decade from 1890 to 1900, when prices fell to the 
lowest point in the century. The apparent rise in real 
wages at that time should therefore be discounted, 
though just how much, cannot be accurately deter- 
mined. The opposite result is obtained during the 
great upswing of prices from 1914 to 1920. For these 





years, however, adequate data on the cost of living 
are obtainable. In Table XI estimates derived from 
such data are used in the place of wholesale prices, 
and real wages are then computed. 1 

TABLE XI

INDEXES OF WAGES AND COST OF LIVING
UNITED STATES, 1913-1920
(Estimated from Bureau of Labor Statistics’ data)

Year      Wages     Cost of Living     Real Wages

1913       100           100               100
1914       102           100               102
1915       103           100               103
1916       111           110               101
1917       128           134                95
1918       162           154               105
1919       200           180               111
1920       225           211               107


Indexes of the Cost of Living. The computation of 
an index of the cost of living involves many difficulties, 
both theoretical and practical. To begin with, the units 
employed often vary in quality, and are difficult to 
standardize. Hence the first step in the computation 
is the drawing up of a selected list of articles of staple 
grades. These articles must be sufficient in number 

1 It is not intended that the wage data here studied shall be taken
as a final measurement of the course of real wages. They were com- 
piled chiefly with the purpose in mind of illustrating certain methods of 
work. But it is the opinion of the writer, based on a study of such 
material as is available, that they represent a passably good estimate. 
Other studies, however, have purported to show a decline in real 
wages since the last decade of the nineteenth century. An interesting 
study of this sort appears in the American Economic Review, Septem- 
ber, 1921. In this study the cost of living is assumed to be measured 
by retail food prices. But this is a very questionable measure, inas- 
much as retail food prices have risen about as rapidly as wholesale 
prices since the period of agricultural depression. This abnormal rise, 
of course, makes real wages by comparison appear to fall. The study 
also uses an index of hour rates which purports to show a slower 
rise than is shown by the index published by the Bureau of Labor 
Statistics. It should be noted that the data here discussed do not cover 
wages of government employes. 




and importance to represent fairly all the necessities 
commonly purchased by the average working-class 
family. After this has been done, the prices of these 
articles as sold in representative stores must be tabu- 
lated. If the index is to cover a considerable territory, 
data must be gathered from a number of localities. 
The common average of the prices of each article at a 
given date is found. Under certain conditions, as 
when the extreme items are likely to be in error, the 
median is preferable in place of the common average. 
Since the work of collecting and tabulating prices 
requires elaborate organization, it is done on a large 
scale by only a few agencies. One of these is the 
National Industrial Conference Board. 1 Another is a 
commission established by the Legislature of the State 

1 The National Industrial Conference Board is affiliated with the Na- 
tional Association of Manufacturers and other similar organizations, 
and has its headquarters in New York. It publishes among other things 
an excellent index of increases in the cost of living in the United 
States. This index is issued promptly each month, and is the best 
available source by which current changes may be measured. By per- 
mission, the indexes for the year 1920 are here reprinted. The base is 
July, 1914, and the figures show the per cent rise. 


Month,        Cost of                                     Fuel and
1920          Living     Food    Shelter    Clothing      Light      Sundries

January         90.2       97       43         170          49          77
February        93.5      101       45         177          49          78
March           94.8      100       49         177          49          83
April           96.6      100       50         188          51          83
May            101.6      111       51         187          55          83
June           103.0      115       51         176          61          85
July           104.5      119       58         166          66          85
August         103.2      119       58         155          69          85
September       99.4      107       59         155          78          83
October         97.3      103       59         148          83          90
November        93.1       93       66         128         100          92
December        90.0       93       66         105         100          92


Of these indexes the only one which shows a further rise up to October,
1921, is that for shelter. This registers 71% during March to June, 
inclusive, and 69% during the four months following. Fuel and Light 
and Sundries begin to decline after January. Food shows an increase 
for four months following a minimum of 45% in June, but this change 
is reflected in only a minor degree in the aggregate index. The apex 
of the post-war boom is, then, according to this index, in July, 1920.



TABLE XII

FOOD COST OF LIVING PER FAMILY, JUNE, 1920, COMPARED WITH AVERAGE COST IN 1913

Columns: (1) Consumption; (2) Price per unit, 1913; (3) Cost, 1913;
(4) Price per unit, June, 1920; (5) Cost, June, 1920; (6) Relative price;
(7) “Weight”; (8) Extension.

[Figures in columns 2 to 8 are illegible in this copy.]

ARTICLE OF FOOD       (1) CONSUMPTION

Sirloin steak             32 lbs.
Round steak               32 lbs.
Rib roast                 31 lbs.
Chuck roast               31 lbs.
Plate beef                23 lbs.
Pork chops                36 lbs.
Bacon                     17 lbs.
Ham                       22 lbs.
Lard                      34 lbs.
Hens                      23 lbs.
Eggs                      61 doz.
Butter                    66 lbs.
Cheese                    12 lbs.
Milk                     337 qts.
Bread                    531 lbs.
Flour                    264 lbs.
Corn meal                 54 lbs.
Rice                      35 lbs.
Potatoes                 704 lbs.
Sugar                    147 lbs.
Coffee                    40 lbs.
Tea                        8 lbs.

Total: column 3, $215.890; column 5, $479.435; column 7, 1000; column 8, 222012.

$215.890 ) $479.435 = 222%          1000 ) 222012 = 222%





of Massachusetts. But doubtless the best known and 
most authoritative work is done by the Bureau of 
Labor Statistics. The methods used in combining 
average prices into a single index may be illustrated 
by the Bureau of Labor Statistics’ index of the food 
cost of living. 

An Index of Food Prices. In computing an index 
of the food cost of living, the Bureau of Labor Statis- 
tics has listed 22 important articles of food, as shown 
in Table XII. The use of this list was begun in Jan- 
uary, 1913, and was continued to January, 1921, when 
a revised list of 43 articles was substituted. Prices on 
these articles were compiled monthly from a number 
of important cities selected from the various terri- 
torial divisions of the United States. The averages 
of these prices for the year 1913 and for the month of 
June, 1920, are shown in Table XII, together with the 
annual consumption per workingman’s family as ascertained
by a special investigation, which is here assumed
to apply directly to 1913. From these data the
increase in the food cost of living from the pre-war 
level to the climax of prices in 1920 may be computed. 

The Aggregate Method. The two standard methods 
of making the computation are shown in Table XII. 
The first and simplest of these is known as the aggre- 
gate method, and is illustrated in the first five columns 
of the table. The process consists merely in finding 
the total cost of a year’s supply of the given articles 
at the two contrasted dates (columns 3 and 5). The
total cost in June, 1920 ($479.44), is then divided by 
the total cost in 1913 ($215.89). The result shows that 
the former cost was 222% as compared with 100% in 
the base year, 1913 — an increase of 122%. While the 





totals do not actually represent more than about two- 
thirds of the average annual food cost per family, yet 
they doubtless are representative enough to give a 
close approximation to the correct percentage increase. 
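The aggregate method may be sketched in Python. The three-article basket below is purely hypothetical and is not drawn from Table XII, whose item prices are not reproduced here; only the two totals, $215.89 and $479.435, are taken from the text. The function name is likewise illustrative.

```python
def aggregate_index(basket):
    """Cost of the fixed basket at the later date as a percentage of its base cost."""
    cost_base = sum(qty * p0 for qty, p0, p1 in basket.values())
    cost_later = sum(qty * p1 for qty, p0, p1 in basket.values())
    return 100 * cost_later / cost_base

# Hypothetical basket: article -> (annual quantity, base price, later price).
basket = {
    "bread": (500, 0.06, 0.12),
    "milk": (300, 0.09, 0.16),
    "sugar": (150, 0.05, 0.25),
}

cost_base = sum(qty * p0 for qty, p0, p1 in basket.values())
cost_later = sum(qty * p1 for qty, p0, p1 in basket.values())
index = aggregate_index(basket)
print(round(cost_base, 2), round(cost_later, 2), round(index, 1))

# The same division applied to the totals quoted in the text.
table_xii_index = 100 * 479.435 / 215.890
print(round(table_xii_index))
```

The quantities are held fixed at both dates, so that any change in the total is due to prices alone.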

The Proportional Expenditure Method. A second 
and more complex process, known as the proportional 
expenditure method, is illustrated in columns 6, 7, and 
8. By this method the relative price in June, 1920, as 
compared with 1913 is found (column 6). By so doing, 
all prices, whether large or small, are obviously put 
on the same basis, since each is based on 100% in 1913. 
These relative prices are next averaged by a process of 
weighting similar to that explained in Chapter II in 
connection with Table IV. The weights, as shown in 
column 7, measure the relative importance in a work- 
ingman’s budget of an annual supply of each article. 
To illustrate, the first weight is obtained by dividing 
$8.128 by $215.89. This gives approximately 3.8%, but
both the decimal point and the per cent sign are
dropped as being unnecessary in a ratio. In the same
way the second weight, 33, is obtained by dividing
$7.136 by $215.89, and so on for the remainder of the
weights. The accuracy of the work may be checked 
by means of the total, which should come to approxi- 
mately 1000. In the computation of such weights, it 
is usually unnecessary to strive for extreme accuracy ; 
as, for example, by carrying the figures to several 
decimal places. A considerable margin of error may 
be present in the weights without materially affecting 
the weighted average. 
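The proportional expenditure method may be sketched in the same way. The basket is again hypothetical, and the names are illustrative; only the two costs $8.128 and $7.136 and the total $215.89 come from the text. The weights are base-year expenditures reduced to parts per thousand, and the index is the weighted average of the relative prices.

```python
# Hypothetical basket: article -> (annual quantity, base price, later price).
basket = {
    "bread": (500, 0.06, 0.12),
    "milk": (300, 0.09, 0.16),
    "sugar": (150, 0.05, 0.25),
}

# Base-year expenditure on each article, and the relative price (base = 100).
expenditure = {a: qty * p0 for a, (qty, p0, p1) in basket.items()}
relatives = {a: 100 * p1 / p0 for a, (qty, p0, p1) in basket.items()}
total = sum(expenditure.values())

# Weights rounded to parts per thousand, as in column 7 of the table.
weights = {a: round(1000 * e / total) for a, e in expenditure.items()}
index_weighted = sum(weights[a] * relatives[a] for a in basket) / sum(weights.values())

# With unrounded weights the result equals the aggregate index exactly.
index_exact = sum(expenditure[a] * relatives[a] for a in basket) / total
index_aggregate = (100 * sum(qty * p1 for qty, p0, p1 in basket.values())
                   / sum(qty * p0 for qty, p0, p1 in basket.values()))

# The first two weights of the text, from the quoted costs.
first_weight = round(1000 * 8.128 / 215.89)
second_weight = round(1000 * 7.136 / 215.89)

print(weights, round(index_weighted, 1), round(index_exact, 1))
print(first_weight, second_weight)
```

The small difference between the rounded and unrounded results illustrates the remark that a considerable margin of error in the weights does not materially affect the weighted average.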

A Comparison of the Two Methods. It will be ob- 
served that the result by the proportional expenditure 
method is exactly the same in this instance as that 





obtained by the aggregate method. But this would not 
always be the case. Where there is a difference, pref- 
erence is likely to be given to the result obtained by the 
former method. The advantage claimed for this 
method arises from the fact that it is not practicable 
to make frequent revisions of the consumption esti- 
mates. When an estimate has become somewhat out 
of date, its inaccuracies may have considerable effect 
upon the results obtained by the aggregate method. 
But it is assumed that the proportional expenditure 
weights, based as they are upon both the prices and 
the consumption of a specific period, will remain rela- 
tively accurate for a longer period than will the con- 
sumption estimates taken alone. Hence their use is 
thought to give somewhat more dependable results. 
In practice it will be found that the proportional ex- 
penditure method is not as tedious as it seems in the 
illustration. The weights, when once obtained, may 
be used unchanged for a long period; and the finding 
of the relative prices does not usually mean additional 
work, since they are in any case desirable for purposes 
of comparison. 

The formula for the price index by the aggregate 
method is : 

\[ P_{n} = \frac{\Sigma p_{n} q_{x}}{\Sigma p_{0} q_{x}} \]

in which

\(P_{n}\) = price index for the relative period
\(p_{n}\) = prices for the relative period
\(q_{x}\) = quantities consumed during a given period,
preferably the base period
\(p_{0}\) = prices for the base period
The process may be described briefly as a comparison 





of two price averages, both obtained by weighting for 
quantities. 

The formula for a price index by the proportional
expenditure method is: 1

\[ P_{n} = \frac{\Sigma \left( \dfrac{p_{n}}{p_{0}} \times p_{x} q_{x} \right)}{\Sigma p_{x} q_{x}} \]

This process may be described as an average of rela- 
tive prices weighted for the value consumed during 
some specified period, often prior to the beginning 
of the interval over which the price comparison is be- 
ing made. In connection with both formulas, it must 
be understood that the factors are taken distributively ; 
that is, only prices and weights belonging to the same 
article are multiplied. 2 
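The two formulas can be compared numerically. The following Python sketch uses invented prices and quantities for three articles (none of the figures come from Table XII), and it assumes that the weighting period x coincides with the base period o; the function names are illustrative only.

```python
# Hypothetical data for three articles; not taken from Table XII.
p_o = [0.20, 0.10, 0.05]    # base-period prices
p_n = [0.30, 0.12, 0.10]    # prices in the period being compared
q_x = [50.0, 200.0, 100.0]  # quantities consumed in period x
p_x = p_o                   # assumption: period x is the base period


def aggregate_index(p_n, p_o, q_x):
    """Sum(p_n * q_x) / Sum(p_o * q_x), expressed as a percentage."""
    num = sum(pn * q for pn, q in zip(p_n, q_x))
    den = sum(po * q for po, q in zip(p_o, q_x))
    return 100.0 * num / den


def proportional_expenditure_index(p_n, p_o, p_x, q_x):
    """Average of the relatives p_n/p_o, weighted by expenditure p_x * q_x."""
    weights = [px * q for px, q in zip(p_x, q_x)]
    relatives = [pn / po for pn, po in zip(p_n, p_o)]
    num = sum(r * w for r, w in zip(relatives, weights))
    return 100.0 * num / sum(weights)


agg = aggregate_index(p_n, p_o, q_x)
prop = proportional_expenditure_index(p_n, p_o, p_x, q_x)
print(round(agg, 1), round(prop, 1))
```

Factors are multiplied article by article, as the text requires. If `p_x` is replaced by the prices of some other period, the two results will in general no longer agree.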

An inspection of the formulas will show why the
two results obtained in Table XII are alike. As is
often the case, the year indicated by the subscript x,
to which the quantities and weights are assumed to
belong, is identical with the base year indicated by the
subscript o. The numerator of the second formula
therefore reduces to the same form as that of the first
formula, and the denominators also have the same
value. But the assumption that the quantities as given
apply directly to the year 1913 was adopted merely to
simplify the table for purposes of illustration. As a
matter of fact, these quantities were obtained by an in-
vestigation made in 1918, and were not put into use
until the beginning of 1921. The consumption esti-
mates actually used by the Bureau of Labor Statistics
from 1913 to 1920, inclusive, are shown in Table XII-A,
together with the annual retail food prices for the
same period, and for June, 1921. It will be seen that
these estimates go back to the year 1901.

1 The formula as thus stated does not indicate the reduction of the
proportional expenditure weights to percentages. The value is, of
course, unaffected by such reduction; and for purposes of later com-
parison the form as given is preferable.

2 Use is sometimes made of the harmonic mean in finding the average
of the relatives. This average is found by taking the reciprocals of
the relatives, computing their average, and then taking the reciprocal
of this average. It may easily be seen that the effect of taking the
reciprocals of the relatives is to reverse the base; that is, the later year
becomes the base instead of the earlier year. If weights are used it is
therefore preferable that they be derived from the data of the later
year in finding the harmonic mean. Taking the reciprocal of the
average again reverses the bases. But the index obtained by the usual
direct method will not be quite the same as that obtained by the
harmonic mean. This will ordinarily be the case whether the weights
are shifted to the later base, or not; or indeed whether any weights
are employed or not. The same distinction between the arithmetic and
the harmonic mean is well brought out by taking a simple average
of a given set of prices, and comparing it with the harmonic mean of
the same prices. The latter is the reciprocal of the average of the
quantities of each commodity purchasable for one dollar.

[Table XII-A: consumption estimates used by the Bureau of Labor
Statistics, and annual retail food prices, 1913-1920 and June, 1921.
Table body not recoverable.]

Limitations. Some of the limitations of the fore- 
going methods of computing changes in the cost of liv- 
ing may be easily seen. It is evident that the assum- 
ing of an unvarying consumption may sometimes result 
in material inaccuracies. If there is a very uneven 
rise in prices, buyers will begin to substitute the 
cheaper article for the more expensive, wherever this 
can be properly done. The effect on the one hand 
will be to moderate the unevenness of price changes, 
but on the other hand it will make material altera- 
tions in the budget. But the frequent revision of con- 
sumption data, and the complexity of methods of com- 
putation, would involve more labor than at present 
seems practicable to allow. Besides, the revision of 
consumption data raises another problem. Changing 
quantities may in part reflect rising standards of com- 
fort whereas the term, “cost of living,” implies pro- 
vision only for those necessities which are required to 
maintain the social efficiency of the family. Strictly
speaking, the measurement of changes in such costs
would call for the dietician’s units of protein, carbohy- 
drates, fats, and calories, and would require very de- 
tailed analyses. Hence, as far as the cost of living 
is concerned, we are driven back to the comparatively 
simple methods actually in use. The results obtained 
by such methods must, of course, always be taken as 
approximations only. 

Combining the Partial Indexes. In measuring 
changes in the entire cost of living, the Bureau of 
Labor Statistics makes an index for several groups of 
commodities, in addition to food. The process of com- 
bining these partial indexes into a single measure is an 
application of the proportional expenditure method. 
The budgetary studies of 1918 furnish the weights 
now in use. The partial indexes in June, 1920, the 
weights, and the process of finding a single index, may 
be seen in the following table : 


                          Index         Weight
Item of               of June, 1920    (per cent
Expenditure           (base, 1913)     of budget)    Extension

Food                      219.0           38.2         8365.80
Clothing                  287.5           16.6         4772.50
Housing                   134.9           13.4         1807.66
Fuel and light            171.9            5.3          911.07
Furniture and
  furnishings             292.7            5.1         1492.77
Miscellaneous             201.4           21.3         4289.82
                                         -----        --------
Total                                     99.9        21639.62
Combined index                                           216.5
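The combining process shown in the table can be sketched in Python as a weighted average: each partial index is multiplied by its weight, and the sum of these extensions is divided by the sum of the weights. The variable names are illustrative. With the rounded figures as tabulated, the quotient comes out near 216.6, within a tenth of a point of the printed result.

```python
# Partial indexes for June, 1920 (base, 1913) and budget weights, as tabulated.
groups = {
    "Food": (219.0, 38.2),
    "Clothing": (287.5, 16.6),
    "Housing": (134.9, 13.4),
    "Fuel and light": (171.9, 5.3),
    "Furniture and furnishings": (292.7, 5.1),
    "Miscellaneous": (201.4, 21.3),
}

# Extension = partial index multiplied by its weight.
extensions = {name: index * weight for name, (index, weight) in groups.items()}

total_extension = sum(extensions.values())
total_weight = sum(weight for _, weight in groups.values())
combined_index = total_extension / total_weight

print(round(total_extension, 2), round(total_weight, 1), round(combined_index, 1))
```

Dividing by the actual sum of the weights (99.9) rather than by 100 keeps the result a true weighted average even though the percentages do not add exactly to 100.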

Indexes of Wholesale Prices. Though neither the 
aggregate nor the proportional expenditure method is 
theoretically perfect, yet one or the other is used in 
practically all the price indexes in common use. In
adapting the latter method to wholesale prices, the
weights are usually made to reflect the proportional 
value produced, rather than family consumption, but 
the principle is essentially the same. Applying this 
method, the Bureau of Labor Statistics compiles an 
elaborate monthly index of wholesale prices. The 
number of items tabulated, the base used, and other 
details of the work have been varied from time to 
time, but the partial results have been combined into a 
single index having the same base. For several years 
prior to the war, the base used was an average of the 
decade 1890-99, which was a period of unusually low 
prices. But in more recent years, 1913 has been em- 
ployed because it may be considered representative 
of normal conditions immediately preceding the war. 
As compiled for January, 1920, the index is derived 
from a weighted average of 327 price quotations. 
The items are classified under nine headings, each 
having its own partial index. 1 This is the most com-
prehensive study of wholesale prices published, but
for business uses it has the disadvantage of appearing
two or three months late.

1 Annual group index numbers for the years 1890-1920 will be found
in the Monthly Labor Review, Feb., 1921, page 45. The figures for
recent dates are as follows:

GROUP INDEX NUMBERS, UNITED STATES BUREAU OF LABOR STATISTICS
For the Years 1913-1920, and Sept., 1920 and 1921

             Farm      Food,  Cloths and  Fuel and  Metals and  Building   Chemicals  House-furnish-  Miscel-
Period       products  etc.   clothing    lighting  metal       materials  and drugs  ing goods       laneous
                                                    products

1913           100     100      100         100       100         100        100         100            100
1914           103     ...       98          96        87          97        101          99             99
1915           105     104      100          93        97          94        114          99             99
1916           122     126      128         119       148         101        159         115            120
1917           189     176      181         175       208         124        198         144            166
1918           220     189      239         163       181         161        221         196            193
1919           234     210      261         173       161         192        179         236            217
1920           218     239      302         238       186         308        210         366            236
Sept., 1920    210     223      278         284       192         318        222         371            289
Sept., 1921    122     146      187         178       120         193        162         223            146

One of the most widely used commercial indexes of 
wholesale prices is Bradstreet’s. This index is based 
on about one hundred important articles, for which 
prices are easily available in the principal business 
centers. It is published promptly each month, and 
is therefore valuable to the business man who desires 
to keep in touch with the immediate trend of the mar- 
ket. The aggregate method of computation is em- 
ployed, and the result is given merely as a sum of 
money, which may be reduced to any desired base. 
The lack of a definite system of weights is a defect 
of this index, but it is partially compensated for by a 
careful selection of items, and a repetition of important 
articles by a quotation for more than one grade. 
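Because Bradstreet’s result is stated as a sum of money, it must be reduced to a base before it can be compared with a percentage index. The Python sketch below shows one way of doing this, using a few of Bradstreet’s totals as printed in the text’s table of wholesale price indexes; the function name `rebase` is illustrative only.

```python
def rebase(series, base_value):
    """Express each value of a series as a percentage of base_value."""
    return {period: 100.0 * value / base_value for period, value in series.items()}


# Bradstreet's dollar totals for 1913 and selected months of 1920.
bradstreet = {
    "1913": 9.2115,
    "Jan. 1920": 20.3638,
    "May 1920": 20.7341,
    "Dec. 1920": 13.6263,
}

rebased = rebase(bradstreet, bradstreet["1913"])
for period, value in rebased.items():
    print(period, round(value, 1))
```

The same function serves for any base: passing the average of a series as `base_value` reduces the series to a base consisting of its own average.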

Dun’s monthly index of wholesale prices is perhaps 
not as widely known as Bradstreet’s, but it appears to 
be more scientifically compiled. It is based upon ap- 
proximately 300 price quotations, which are grouped 
under seven heads, and finally combined into a single 
index, the result being stated in dollars. Prices are 
weighted in accordance with estimated per capita con- 
sumption. Just how the weights are applied is not 
apparent. Commercial houses, as a rule, do not pub- 
lish the details of their statistical work. 

The Federal Reserve Board has begun the publica- 
tion of a monthly index of wholesale prices, based on 
about 90 commodities. It is intended for use particu- 
larly in international comparisons, but its value is 
not limited to this field. 1 Still another index intended
to measure wholesale price movements is published by
the Babson Statistical Organization. This index is
compiled monthly from quotations on ten basic com-
modities, and in spite of its narrow base, serves a use-
ful purpose. A number of other less complete indexes
might be mentioned, some of which appear weekly.
One of these is the Annalist’s weekly index of whole-
sale food, a weighted average of 25 prices. It is obvious
that there may be many price indexes giving somewhat
different results, yet each one valid in its own sphere.
The purpose a given index is to serve will always con-
trol the selection of items and the weighting.

1 The Federal Reserve Board classifies its data so as to present indexes
for goods imported and exported, and producers’ and consumers’ goods,
as well as for certain commodity groups. These indexes are published
monthly in the Federal Reserve Bulletin.

As a means of comparing some of the indexes just
mentioned, Table XIII is given, showing several whole-
sale price indexes for the year 1913, and by months
for 1920. The data for 1920 are of interest because
they contain the peak of prices for the war and post-
war cycle. 2

TABLE XIII

INDEXES OF WHOLESALE PRICES

            Bureau                                Federal
            of Labor    Brad-                     Reserve
Period      Statistics  street’s     Dun’s        Board     Babson’s

1913           100      $9.2115     $120.8865      100       $1.26

1920
Jan.           248      20.3638      247.394       242        3.30
Feb.           249      20.8690      253.748       242        3.44
Mar.           ...      20.7950      253.016       248        3.59
Apr.           265      20.7124      257.901       263        3.61
May            272      20.7341      263.332       264        3.66
June           269      19.8752      262.149       258        3.71
July           262      19.3528      260.414       250        3.60
Aug.           ...      18.8273      252.288       234        3.43
Sept.          242      17.9746      248.257       226        3.39
Oct.           225      16.9094      237.341       208        3.25
Nov.           ...      15.6750      227.188       190        2.98
Dec.           189      13.6263      211.628       171        2.75

2 In comparing these data, it should be noted that the commercial
indexes (Bradstreet’s, Dun’s, and Babson’s) are based on price quota-
tions for the first of each month.














Other Indexes. In addition to indexes of commodity
prices, many other indexes, measuring various phases
of business activity, are published. Speculative and 
investment activities are measured by indexes of stock 
and bond prices. A good index of stock prices is dif- 
ficult to make because of the continually changing 
status of the corporations issuing the stocks. The 
indexes now in common use are based on quotations of 
standard shares listed on the New York Stock Ex- 
change. Railroad and industrial shares are usually 
compiled separately, and the two sets combined into a 
composite index. Financial and productive activities 
in the general markets are measured by a variety of 
data, such as the number of shares traded on the New 
York Stock Exchange, bank clearings in New York 
and in the country as a whole, bank deposits, money 
in circulation, gold movements, the interest rate, the 
production of pig iron, the number of building per- 
mits issued in leading cities, the number and extent of 
business failures, and the balance of foreign trade. 
Labor conditions are measured by wage and employ- 
ment data, examples of which are the reports of the 
New York State Industrial Commission. Retail trade 
is indicated by reports of department store sales. 
Most of these various indexes as they appear in the 
financial papers, consist of mere statements of periodi- 
cal figures. Certain phases of their statistical elabora- 
tion will be considered later under the subject of 
trends. 



EXERCISES 

1. Reduce the three indexes shown on page 53 to a base 
consisting of the average of each series, respectively. 

2. Reduce Bradstreet’s, Dun’s and Babson’s indexes of
wholesale prices for 1920 (page 65) to a 1913 base.
Compare the divergence of these indexes, and the others 
given in the same table, for the month of May by the 
use of the coefficient of average deviation. 

3. (a) Plot on the same sheet of semi-logarithmic paper the 
two indexes of real wages given in Table X, page 51. 

(b) Plot together on the same chart the indexes of 
wages and cost of living shown in Table XI, page 53. 

4. Assuming that the ratio of American to European com-
modity prices (both price levels being stated in indexes
having 1913 as a base) measures the relative deprecia-
tion of European currencies, find the theoretical value
of these currencies for September, 1921, using the data
given below (Federal Reserve Bulletin, Nov., 1921).
Express the market prices of exchange as percentages
of the theoretical values.

                      Index of            Exchange in
                      Wholesale Prices    New York (Cables,
Country               September, 1921     per cent of par)

United States              143                 ...
United Kingdom             191                76.5
France                     344                37.7
Italy                      580                21.8
Germany                   1777                 4.0
Sweden                     182                81.3
Norway                     287                48.0
Denmark                    224                65.9


5. The following table gives a monthly index of wholesale 
prices obtaining in the United Kingdom (“Statist” in- 
dex) and the average monthly price in New York of 
sterling exchange (cables), for the year 1920. Using the 
Federal Reserve Board index (page 65) as a measure 
of the price level in the United States, find the theoretical 
value of the pound sterling (par, $4.8665) each month 
in terms of American money. Compare the results thus 
obtained with the cost of sterling exchange by graphing 
both series. (The graph should be drawn so as to show 
the zero line at the base. Why?) 


                Statist index     Sterling cables
1920            (Base, 1913)      (New York)*

January             288              $3.68
February            306               3.39
March               307               3.72
April               313               3.93
May                 305               3.85
June                300               3.95
July                299               3.86
August              298               3.63
September           292               3.52
October             282               3.47
November            263               3.43
December            243               3.63

6. From the data of Table XII-A, page 60, find index num- 
bers of the food cost of living, 1913-1920, inclusive, and 
June, 1921 ; base, 1913. Compare the results with the fol- 
lowing index published by the Bureau of Labor Statistics: 

                Index of
Year            food costs

1913               100
1914               102
1915               101
1916               114
1917               146
1918               168
1919               186
1920               203
June, 1921         144*

* Based on 43 articles.

7. From the average prices given below, find the relative
prices in 1918 as compared with 1913. Find also the
weighted average of these “relatives,” applying the
weights given.

                    1913       1918     Weights

Wheat, bushel      $1.04      $2.31        4
Corn, bushel         .71       1.84       10
Cotton, pound        .13        .32        5
Iron, ton          14.90      36.52        3
Copper, pound        .16        .25        1


8. On the basis of the following prices of five articles of 
food, and the relative importance of each in the family 
budget, find the increase in the food cost of living be- 
tween the dates given : 

                    Price in    Price in
Article               1913        1920      Importance

Steak, pound         $0.22       $0.40          70
Milk, quart            .09         .17         140
Bread, pound           .06         .12         140
Butter, pound          .38         .70         115
Sugar, pound           .06         .20          40

9. Compute a set of proportional expenditure weights on 
the basis of 1901 consumption and 1913 prices. Apply 
these weights to the relatives given in column 6, Table
XII, page 55.


10. From the following table, compute an index of agricul- 
tural real wages having 1913 as its base. Compare the 
results with the corresponding index of hour rates in 
the United States by graphing both on the same chart 
for the years in which farm wages are given. (The base 
line of the graph should show each year from 1875 to 
1920. Why?) 


WAGES OF CERTAIN CLASSES OF MALE FARM LABOR BY THE
MONTH WITHOUT BOARD

(Monthly Labor Review, July, 1920, and March, 1921.)

Year        Wage

1875       $19.87
1879        16.42
1882        18.94
1885        17.97
1888        18.24
1890        18.33
1892        18.60
1893        19.10
1894        17.74
1895        17.69
1898        19.38
1899        20.23
1902        22.14
1910        27.50
1911        28.77
1912        29.58
1913        30.31
1914        29.88
1915        30.15
1916        32.83
1917        40.43
1918        48.80
1919        56.29
1920        64.95


11. The Bureau of Labor Statistics found the increases in 
the cost of living for various classes of commodities in 
the United States, from 1913 to the year and month 
indicated to have been as follows:

                                   Per Cent of Increase

Item of              Dec.    Dec.    Dec.    Dec.    Dec.    Dec.    Dec.
Expenditure          1914    1915    1916    1917    1918    1919    1920

Food                  5.0     5.0    26.0    57.0    87.0    97.0    78.0
Clothing              1.0     4.7    20.0    49.1   105.3   168.7   158.5
Housing               0.0     1.5     2.3      .1     9.2    25.3    51.1
Fuel and light        1.0     1.0     8.4    24.1    47.9    56.8    94.9
Furniture and
  furnishings         4.0    10.6    27.8    50.6   113.6   163.5   185.4
Miscellaneous         3.0     7.4    13.3    40.5    65.8    90.2   108.2


Using the weights given on page 62, find the total in- 
crease in the cost of living, and the index of the same, 
for the periods named in the foregoing table. Compare 
the results with those given in the Monthly Labor Re- 
view, November, 1921, page 83. 

12. Using price quotations obtained locally, and from cata- 
logs from which local purchases are commonly made, 
find the increase in the cost of living between two given 
dates, preferably 1913 or 1914 and the present time. In 
finding the food index, make use of the proportional ex- 
penditure weights given in Table XII, page 55. For 
the other groups of items, make use of the following pro- 
portional expenditure weights quoted from Massachusetts 
House Report, No. 1500. Items for which quotations are 
not available may be dropped, or substitutions may be 
made. 


WEIGHTINGS IN THE CLOTHING INDEX

MEN’S

Overcoat, Suit, Trousers (together)    39
Shoes                                  15
Hats                                    4
Gloves                                  6
Socks                                   4
Shirts                                  6
Collars                                 2
Underwear                               6
Night Garments                          2

Total                                  84


WOMEN'S 

Suit 
Topcoat 
Street Dress 
Underwear 
Waists 
Kimono 
House Dress 
Aprons 
Night Gown 
Underskirt 

Shoes 

Gloves 

Hosiery 

Corsets 

Hats 



27 

5 

18 


12 

3 
2 

4 
9 


Total 


80 


SHELTER INDEX 

Obtain rentals of several representative homes for the 
two dates required, and find the average per cent in- 
crease. 


WEIGHTINGS IN THE FUEL INDEX 

Coal 10 

Kerosene 1 

Gas 2 

Electricity 2 

Total 15 

WEIGHTINGS IN THE SUNDRY INDEX

Ice                     15
Carfare                 15
Entertainment           25
Medicine                25
Insurance               50
Church                  30
Tobacco, etc.           20
Reading                 10
House furnishings       45
Organizations           25

Total                  260

Combine the various partial indexes by means of the
following proportional expenditure weights:

Food                  43.1
Shelter               17.7
Clothing              13.2
Fuel and light         5.6
Sundries              20.4

CHAPTER IV 


QUANTITY INDEXES AND THEIR USES 

Value and Quantity Indexes. Students of general 
market conditions are interested in changes in the 
volume of production, since such changes have an inti- 
mate relation to prices, wages, and business activity. 
The attention of statisticians has therefore been re- 
cently turned to the development of indexes of pro- 
duction. Such indexes may be of two sorts: (1) an 
index of value production, measuring the number of 
dollars’ worth of goods created in a given period of 
time; and (2) an index of physical production, meas- 
uring the same output primarily in such physical units 
as pounds, bushels, and yards. Somewhat akin to the 
latter is an index of the physical volume of trade, 
which attempts to measure the total number of physical 
units traded in a given period. Because of the sea- 
sonal nature of a large part of industry, indexes of 
production are generally based on the year as the unit 
of time. 1 

Indexes of value production do not call for an ex- 
tended discussion, since they are merely inventories of 
representative annual output at the average current 
prices. The process of finding them may be illustrated
by multiplying each year’s output of wheat,
corn, cotton, pig iron, and copper, as shown in Table
XIV, by their respective prices as shown in Table
XIV-A. The resulting annual totals may be taken pro-
visionally as an index of value production of raw
materials, and may be reduced to any desired base.
Such indexes are not, however, of much use in them-
selves, except as they may be employed in connec-
tion with the computation of physical production and
price indexes.

1 One of the most practical uses of production index numbers, how-
ever, requires monthly or weekly data, corrected for seasonal variations.
These data are beginning to be used in connection with business
barometrics, as will be noted in the next chapter.
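As a rough illustration of the value production index, the following Python sketch multiplies output by current price for the five commodities and reduces the totals to a 1913 base. The figures for 1913 and 1914 are copied from Tables XIV and XIV-A; the two-year span and the names used are choices made here for brevity.

```python
# Output (Table XIV): wheat and corn in millions of bushels, cotton in
# millions of bales, pig iron in millions of tons, copper in millions of pounds.
output = {
    1913: {"wheat": 763, "corn": 2447, "cotton": 13.983, "pig_iron": 30.966, "copper": 1224},
    1914: {"wheat": 891, "corn": 2673, "cotton": 15.906, "pig_iron": 23.332, "copper": 1150},
}

# Average prices (Table XIV-A), in dollars per bushel, bale, ton, and pound.
price = {
    1913: {"wheat": 1.041, "corn": 0.711, "cotton": 64.00, "pig_iron": 14.90, "copper": 0.155},
    1914: {"wheat": 1.094, "corn": 0.793, "cotton": 55.50, "pig_iron": 13.41, "copper": 0.133},
}


def value_of_output(year):
    """Total value in millions of dollars: each output times its current price."""
    return sum(output[year][c] * price[year][c] for c in output[year])


base_value = value_of_output(1913)
value_index = {year: 100.0 * value_of_output(year) / base_value for year in output}
print({year: round(v, 1) for year, v in value_index.items()})
```

Changing `base_value` to the total for any other year, or to an average of several years, reduces the same totals to a different base.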

If the total value production for a given year, as 
measured by an adequate index, shows an increase over 
the preceding year, this increase may evidently be at- 
tributed either to a growing volume of physical pro- 
duction, or to rising prices, or to a combination of both 
factors. It therefore follows that suitable indexes of 
general prices and of physical production, multiplied 
across year by year, must give an index of value pro- 
duction. This fundamental principle may be expressed 
by the formula: 1

$$P_n Q_n = V_n$$

in which

$P_n$ = the price index for a given year
$Q_n$ = the physical production index for the same year
$V_n$ = the value production index for the same year

The formula may also be written:

$$P_n = \frac{V_n}{Q_n}, \quad \text{and} \quad Q_n = \frac{V_n}{P_n}$$
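The relation among the three indexes can be tried with a small Python sketch. The index values below are hypothetical, and the functions assume that all three indexes are stated as percentages of the same base year, so that a factor of 100 enters each division.

```python
def physical_index(value_index, price_index):
    """Physical production index = value index / price index (percentages)."""
    return 100.0 * value_index / price_index


def price_index_from(value_index, physical_index_value):
    """Price index = value index / physical production index (percentages)."""
    return 100.0 * value_index / physical_index_value


# Hypothetical figures: value production up 32 per cent, prices up 10 per cent.
v_n = 132.0
p_n = 110.0

q_n = physical_index(v_n, p_n)
print(round(q_n, 1))                          # implied physical production index
print(round(price_index_from(v_n, q_n), 1))   # recovers the price index
```

Only the ratios of the index numbers matter here; stating the same indexes as pure ratios (1.32 and 1.10) and dropping the factor of 100 gives the same relation.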

1 In applying this formula, it is preferable that the two given
indexes should be reduced to the same base, but this is not mathe-
matically essential. It is not the absolute value of the derived index
numbers that is important, but only their ratios to each other.

TABLE XIV 1

PRODUCTION OF SPECIFIED COMMODITIES, U. S., 1870-1920

         Wheat         Corn          Cotton       Pig Iron     Copper
         (millions     (millions     (millions    (millions    (millions
Year     of bushels)   of bushels)   of bales)    of tons)     of pounds)

1870        236          1,094         4.352        1.665          28
1871        231            992         2.974        1.707          29
1872        250          1,093         3.931        2.549          28
1873        281            932         4.170        2.561          35
1874        308            850         3.833        2.401          39
1875        292          1,321         4.632        2.024          40
1876        289          1,284         4.474        1.869          43
1877        364          1,343         4.774        2.067          47
1878        420          1,388         5.074        2.301          48
1879        449          1,548         5.755        2.742          52
1880        499          1,717         6.606        3.835          60
1881        383          1,195         5.456        4.144          72
1882        504          1,617         6.950        4.623          91
1883        421          1,551         5.713        4.596         116
1884        513          1,796         5.682        4.098         145
1885        357          1,936         6.576        4.045         166
1886        457          1,665         6.505        5.683         158
1887        456          1,456         7.047        6.417         181
1888        416          1,988         6.938        6.490         226
1889        491          2,113         7.473        7.604         227
1890        402          1,490         8.653        9.203         260
1891        612          2,060         9.035        8.280         284
1892        516          1,628         6.700        9.157         345
1893        396          1,619         7.493        7.125         329
1894        460          1,213         9.901        6.658         354
1895        467          2,151         7.161        9.446         381
1896        428          2,284         8.533        8.623         460
1897        530          1,903        10.898        9.653         494
1898        675          1,924        11.189       11.774         527
1899        547          2,078         9.393       13.621         569
1900        522          2,105        10.102       13.789         606
1901        748          1,523         8.583       16.878         602
1902        670          2,524        10.588       17.821         660
1903        638          2,244         9.820       18.009         698
1904        552          2,467        13.451       16.497         813
1905        693          2,708        10.495       22.992         889
1906        735          2,927        12.983       25.307         918
1907        634          2,592        11.058       25.781         869
1908        665          2,669        13.086       15.936         943
1909        737          2,772        10.073       25.795       1,093
1910        635          2,886        11.568       27.304       1,080
1911        621          2,531        15.553       23.650       1,097
1912        730          3,125        13.489       29.727       1,243
1913        763          2,447        13.983       30.966       1,224
1914        891          2,673        15.906       23.332       1,150
1915      1,026          2,995        11.068       29.916       1,388
1916        636          2,567        11.364       39.435       1,928
1917        637          3,065        11.302       38.621       1,890
1918        921          2,503        12.041       39.055       1,994
1919        941          2,917        11.421       31.015       1,289
1920        787          3,232        13.366       36.415       1,345

1 This table and the one following are taken with some modifications
from Babson’s Business Barometers, by permission of the author.

TABLE XIV-A

AVERAGE PRICES OF SPECIFIED COMMODITIES, U. S.
(EASTERN MARKETS), 1870-1920

         Wheat       Corn        Cotton      Pig Iron    Copper
Year     per bu.     per bu.     per bale    per ton     per lb.

1870      1.30        1.02         ...        33.23       0.211
1871      1.60         .77        84.50       35.08        .241
1872      1.62         .70         ...        48.94        .355
1873      1.76         .63       100.50       42.79        .280
1874      1.39         .86        89.50       30.19        .220
1875      1.33         .84        77.00       25.53        .226
1876      1.35         .628       64.50       20.75        .210
1877      1.63         .593       59.00       19.25        .190
1878      1.24         .535       56.00       17.05        .165
1879      1.24         .47        54.00       22.82        .186
1880      1.30         .55        57.50       29.86        .214
1881      1.30         .62        60.00       22.54        .181
1882      1.32         .77        57.50       23.20        .191
1883      1.17         .64        59.00       19.62        .165
1884      1.00         .615       54.50       16.80        .110
1885       .94         .51        52.00       15.20        .108
1886       .888        .52        46.00       16.77        .110
1887       .88         .488       51.00       20.05        .138
1888       .94         .593       50.00       16.82        .167
1889       .91         .438       53.00       14.35        .134
1890       .92         .485       55.00       15.10        .156
1891      1.05         .675       43.00       13.78        .127
1892       .908        .54        38.50       12.74        .115
1893       .739        .499       42.50       11.42        .107
1894       .611        .509       34.50        9.93        .095
1895       .669        .477       37.00       10.86        .105
1896       .781        ...        39.50       10.29        .109
1897       .954        .319       35.00        9.42        .113
1898       ...         .376       29.50        9.46        .120
1899       .794        .413       34.00       16.58        .177
1900       .804        .453       46.00       17.04        .166
1901       .803        .567       43.50       13.61        .161
1902       .836        .684       45.00       20.00        .116
1903       ...         .572       55.50       17.08        .132
1904      1.107        .594       58.50       12.73        .128
1905      1.028        .593       49.00       15.57        .156
1906       .865        ...        57.50       16.70        .193
1907       ...         ...        60.50       23.10        .200
1908      1.049        .786       53.00       15.54        .132
1909      1.263        .767       63.00       16.12        .131
1910      1.118        .668       75.50       15.16        .129
1911       ...         .711       65.00       13.67        .125
1912       ...         .711       57.50       14.93        .164
1913      1.041        .711       64.00       14.90        .155
1914      1.094        .793       55.50       13.41        .133
1915      1.291        .837       50.50       13.58        .174
1916      1.468        .929       72.00       18.67        .272
1917      2.346       1.776      117.50       40.07        .272
1918      2.31        1.840      158.50       36.52        .247
1919      2.34        1.771      161.50       32.16        .192
1920      2.65        1.669      173.00       44.03        .175

An Index of Physical Production. The statistical
process of developing an index of physical production
may be illustrated by the use of the limited data shown 
in Table XIV. In aggregating bushels, bales, tons, 
and pounds for any given year, it will be hardly admis- 
sible to add the units as tabulated. While the numbers 
might be taken abstractly and thus combined into an 
index, yet such a process would give as much weight 
to a bushel or a pound as to a bale or a ton. It is true 
that we might here reduce all our units to pounds, but 
even if this were done a pound of copper should be 
stressed more than a pound of corn or of iron, because 
of its greater importance in the markets. The simplest 
way to give each item of production its proper place in 
the total will be to remeasure it in terms of a standard 
value. That is, we may take as the physical unit the 
amount of each commodity that can be bought for a 
dollar at a standard price. The number of such units 
for each year may then be summated as an index of 
physical production.¹ Expressed algebraically, the
process is:

$$Q_n = \sum p_m q_n$$

In so far as the complete index is concerned, this is
equivalent to averaging the quantities as originally
tabulated, weighting them for standard price ($p_m$).

¹ It can be shown, as follows, that the product of price and quantity
can properly be taken as physical units:

Let p = price per pound in dollars,
and n = number of pounds in a given output.

Then 1/p = number of pounds purchasable for one dollar (the new
physical unit),
and n ÷ 1/p = np, the number of new physical units in the output.

A Standard Price. The term “standard price” has
been used to imply a price which may be regarded as
representative of a given commodity for the whole
interval under consideration. Since the standard price
is to be used virtually as a weight, it need not be
determined with very great accuracy. But from the point
of view of the “theory of errors,” it should be the
average price during the whole interval of time which
is covered by the series of quantity indexes dependent
upon it. It follows, therefore, that in periods of
rapidly changing prices, somewhat different comparative
results may be obtained by varying the interval of
time over which the comparisons are made. This may
seem anomalous, but it arises inevitably from the fact
that the concept of aggregate quantity requires a unit
dependent upon value; namely, a dollar’s worth. Since
value is unstable, the unit of quantity is also unstable.
In practice, however, this instability is usually
insignificant.

The application of the foregoing principles to the
data of Table XIV requires the finding of a standard
price for each of the commodities tabulated. As a
base for this price, a period of about twenty years
prior to the war has been chosen, the purpose being to
exclude war prices as extreme, and to emphasize the
later rather than the earlier part of the half century
studied.¹ The averages have been taken approximately,
and have been slightly modified by the use of
data not here cited. They are as follows:

Wheat, per bushel ......  $1.04
Corn, per bushel .......    .64
Cotton, per bale .......  56.00
Pig Iron, per ton ......  16.00
Copper, per pound ......    .16

¹ Because of the shifting of relative prices, it might be theoretically
preferable to subdivide the half century under consideration into at
least three different periods (e. g., 1870-1896, 1896-1914, and 1914-1920),
and to derive different sets of weights for use in each period.
The series of indexes could be brought together in the overlapping
years by a simple adjustment of the bases. But the slight gain in
accuracy thus made would not be worth while here, since in any case
the results obtained from such meager data cannot be considered as
anything more than approximations.

In applying these prices, it should be remembered that
they are in effect weights, and so may be treated as a
multiple ratio. They may therefore be changed by
division (each price being divided by $0.08) into the
more convenient form 13:8:700:200:2.
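The reduction of the standard prices to a simple ratio may be checked with a short sketch. The prices are those listed above, stated in cents so that the arithmetic stays exact; the variable names are illustrative assumptions and not part of the text.

```python
from functools import reduce
from math import gcd

# Standard prices from the text, expressed in cents to keep the arithmetic exact.
standard_prices_cents = {
    "wheat (bushel)": 104,
    "corn (bushel)": 64,
    "cotton (bale)": 5600,
    "pig iron (ton)": 1600,
    "copper (pound)": 16,
}

# Dividing every price by their greatest common divisor leaves the relative
# weights unchanged but gives smaller whole numbers.
divisor = reduce(gcd, standard_prices_cents.values())
weights = {name: cents // divisor for name, cents in standard_prices_cents.items()}

print(divisor)                 # 8
print(list(weights.values()))  # [13, 8, 700, 200, 2]
```

Since only the proportions among the weights matter, either form of the prices should yield the same index once the series is reduced to a base.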

Physical Production in the United States. After the 
standard prices have been decided upon, each annual 
item of production is multiplied by its appropriate 
weight. Two sub-totals are taken for each year, one 
for crops and one for minerals. These sub-totals con- 
stitute two provisional index series. The completed 
indexes are reduced to a base of 1913. A further step 
may be taken by noting the fact, as shown in various 
statistical studies, that an index of mineral production 
runs very close to an index of manufactures. By suit- 
able weighting, the indexes of crops and minerals may 
therefore be combined so as to include, in effect, an 
estimate for the value added by manufacturing. By 
a comparison of aggregate values and a little experi- 
mentation, the requisite weights may be placed at six 
for crops and four for minerals, manufactures being 
theoretically included in the latter. This process of 
weighting and combining is admittedly very crude, 
and would not be expected ordinarily to give more 
than a rough approximation to a comprehensive index. 
But it happens in this case that the five commodities 
on which the work is based are very dependable and 
typical as far as production is concerned. The index
obtained from them in fact checks remarkably well
with Stewart’s (1890-1920) and Day’s (1899-1920)
indexes of physical production. The results, put into
index form, are shown in Table XV.

TABLE XV

INDEXES OF PHYSICAL PRODUCTION, UNITED STATES, 1870-1920

                                   GENERAL
YEAR    CROPS    MINERALS    AGGREGATE    PER CAPITA
1870      38         5           25           61
1871      33         5           22           53
1872      38         7           25           60
1873      36         7           24           56
1874      34         6           23           52
1875      45         6           29           64
1876      44         5           28           60
1877      48         6           31           65
1878      51         6           33           67
1879      57         8           37           73
1880      63        10           42           81
1881      47        11           33           61
1882      62        13           42           78
1883      56        13           39           69
1884      64        13           43           76
1885      63        13           43           74
1886      61        17           43           72
1887      57        19           42           69
1888      67        20           48           77
1889      73        23           53           83
1890      59        27           46           71
1891      78        26           57           86
1892      62        29           49           72
1893      59        24           45           66
1894      58        24           44           63
1895      72        31           55           77
1896      76        31           58           79
1897      76        34           59           79
1898      81        39           65           85
1899      77        45           64           83
1900      78        46           65           83
1901      71        51           63           78
1902      92        57           78           95
1903      84        58           74           88
1904      92        57           78           91
1905      97        74           88          100
1906     107        80           96          108
1907      93        80           88           97
1908     100        59           83           90
1909      99        85           93           99
1910     100        88           96          100
1911     100        80           92           95
1912     112        98          106          108
1913     100       100          100          100
1914     112        81          100           98
1915     115       101          109          106
1916      94       136          110          106
1917     104       133          115          109
1918     102       137          116          108
1919     111       102          107           99
1920     115       110          113          103
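The combination of the crop and mineral indexes with weights of six and four may be sketched as follows. The figures are rows printed in Table XV; the function name `general_index` and the tolerance of one index point are assumptions made for illustration, since the printed sub-indexes are themselves rounded.

```python
def general_index(crops: float, minerals: float) -> float:
    """Combine the two sub-indexes with weights of six (crops) and four (minerals)."""
    return (6 * crops + 4 * minerals) / 10

# (year, crops, minerals, general aggregate) as printed in Table XV.
rows = [
    (1870, 38, 5, 25),
    (1913, 100, 100, 100),
    (1916, 94, 136, 110),
    (1920, 115, 110, 113),
]

for year, crops, minerals, printed in rows:
    combined = general_index(crops, minerals)
    # The printed sub-indexes are already rounded, so agreement within one
    # index point is all that can be expected.
    print(year, round(combined, 1), printed, abs(combined - printed) <= 1)
```

The small discrepancies (for example 110.8 against the printed 110 for 1916) are of the size that rounding of the sub-indexes would produce.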

Price Indexes. The theory of price indexes¹ may be
advantageously reviewed in the light of the principles
discussed in this chapter. It will be apparent that a
price index, to be theoretically precise, must conform
to the equation,

$$P_n = \frac{V_n}{Q_n}$$

Since this equation is valid whether the indexes
substituted in it have been reduced to a common base or
expressed merely in dollars and “dollar’s worths,” it
may be written, by the use of formulas previously
considered:

$$P_n = \frac{\sum p_n q_n}{\sum p_m q_n}$$

The index numbers obtained by the direct use of this 
formula will have as their base approximately an aver- 
age index, as determined by the use of average prices 
in the denominator. If, however, the indexes of value 
and quantity have first been reduced to a given base, 
then the price indexes will have that base. 

¹ The theories of price and quantity indexes discussed in this book
involve the use of the same list of items from one date to another. If
a change in the number or character of the items is to be made, as the
change from 22 to 43 items in the Bureau of Labor Statistics’ index
of retail food prices, it is assumed that a new series is begun, and is
connected with the old merely by an adjustment of the new base so
as to make the indexes at the two overlapping dates agree. The difficult
theoretical problem of constructing an index based upon continually
shifting lists of commodities is not considered, since it has little
immediate practical value. Production indexes, however, to be quite accurate,
ought to be gradually broadened to take account of the growing
diversification of industry. Hence data such as freight and canal tonnage,
which reflect this increasing diversification, are useful in measuring
production.




For the purpose of making a comparison with the 
proportional expenditure method, the formula may 
be modified by twice inserting the factor p m in such a 
way that it will cancel from each term of the numerator 
summation, as follows: 

$$P_n = \frac{\sum \frac{p_n}{p_m}\, p_m q_n}{\sum p_m q_n}$$

The formula as thus written indicates relatives based 
upon standard prices, and an averaging of these rela- 
tives with weights consisting of contemporary physical 
quantities measured in units of a dollar’s worth at 
standard prices. Briefly stated, the formula means 
relatives weighted for “dollar’s worths.” Of course 
in actual work the simpler form previously stated 
would be used; the latter form is given merely for
comparison and interpretation. Since this method of 
finding a price index involves a standardization of 
quantity units throughout a given period, it may be 
called the method of standard quantities. 

An Approximate Method. The averaging of prices by 
the use of quantity weights suggests an approximate 
method for finding price indexes which may here be 
stated by way of further comparison. This method 
uses actual rather than relative prices, and is weighted 
by the use of the average physical quantities ($q_m$)
produced, consumed, or traded during the period of
time in question. It is analogous to the method used
in finding quantity indexes. Its formula is:

$$P_n = \sum p_n q_m$$

The series of indexes so derived may be reduced to any 
desired base. The formula is a convenient one, but
it is not theoretically valid, and therefore will not give
very dependable results. It fails to meet the test im- 
plied in the formula : 

$$P_n Q_n = V_n$$

Theoretical Difficulties. Certain theoretical difficul- 
ties encountered in developing a method of finding a 
price index may be revealed by approaching the prob- 
lem from another angle. Suppose, first, that we had to 
deal merely with one commodity having a value for a 
given year of four dollars, and for the succeeding 
year of five dollars. If the first year is taken as a 
base, the price index for the second year is 125. If 
the second year is taken as a base, the price index for 
the first year is 80. These two results are consistent, 
as may be shown by the fact that 80% times 125% is 
unity, or by the fact that the reciprocal of the index 80 
is the index 125. 

Let us now consider an analogous problem involving 
two commodities, with quantities and prices stated, as 
follows : 

FIRST YEAR

                 (q)         (p)      (v)
Commodity A,   10 lbs.   @   $4   =   $40
Commodity B,    3 bu.    @    7   =    21

Value index                            61

SECOND YEAR

                 (q)         (p)      (v)
Commodity A,   12 lbs.   @   $5   =   $60
Commodity B,    2 bu.    @    6   =    12

Value index                            72




Instead of making use of average prices as standards 
in computing quantities, we shall follow a common 
usage and take the prices of the base year. Considering 
the first year as the base, and its prices as the stand- 
ards for finding quantity units (dollar’s worths), we 
have 61 as both the value and the quantity index. The 
price index, V ÷ Q, is therefore 100, as it should be in
the base year. In the second year V = 72 and Q = 62, 
the latter result being obtained by summating the 
quantities of the second year at the standard prices 
of the preceding year. The price index for the second 
year is therefore 72% ÷ 62%, or 116%, as compared
with 100% for the first year. 

Reversing the computation, we now take the second 
year as the base year, and its prices as the standard 
prices. The price index for the second year will now 
come to 100, since the value index is identical with the 
quantity index. For the first year the value index will 
be 61 as before; and the quantity index, obtained by 
summating the quantities at the standard prices of the 
second year, will be 68. The price index for the first 
year is therefore 61% ÷ 68%, or 90%.

Are our two results, obtained from different bases, 
consistent with each other, as they were when only 
one price index was used? If so, the product of the 
two indexes, 116% and 90%, should be unity. But 
their product is actually 104%. The other test is that 
the reciprocal of the index 90% should equal the index 
116%. But actually the reciprocal of 90% is 111%. 
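The arithmetic of the two-commodity example may be retraced with a brief sketch. The quantities and prices are those tabulated above; the names `value`, `index_2_on_1`, and `index_1_on_2` are assumptions made for illustration.

```python
# Quantities and prices of the two commodities (A, B) from the text.
q1, p1 = [10, 3], [4, 7]   # first year
q2, p2 = [12, 2], [5, 6]   # second year

def value(prices, quantities):
    """Sum of price times quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))

v1, v2 = value(p1, q1), value(p2, q2)          # 61 and 72

# First year as base: quantities of year 2 valued at year-1 prices.
index_2_on_1 = v2 / value(p1, q2)              # 72 / 62
# Second year as base: quantities of year 1 valued at year-2 prices.
index_1_on_2 = v1 / value(p2, q1)              # 61 / 68

print(round(100 * index_2_on_1))                 # 116
print(round(100 * index_1_on_2))                 # 90
print(round(100 * index_2_on_1 * index_1_on_2))  # 104, not 100
print(round(100 / index_1_on_2))                 # 111, not 116
```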

It is obvious that the lack of consistency arises from 
the failure to standardize quantity units by the use of 
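a constant price. If, now, we apply the method of
standard quantities, taking as the standard price of
each commodity the average of its prices in the two
years ($4.50 for A and $6.50 for B), we obtain the
following results for the two years, respectively:

$$P_1 = \frac{61}{65} = 94\%, \qquad P_2 = \frac{72}{67} = 107\%$$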

Considering the first year as a base, we obtain an in- 
dex of 114 for the second year. If the second year is 
taken as the base, a consistent result will necessarily 
be obtained. It will be seen that the index for the 
second year thus obtained is nearly midway between 
the two indexes, 116 and 111, obtained before. 
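The method of standard quantities, applied to the same two commodities, may be sketched as follows. It is assumed here that the standard price of each commodity is the simple average of its two yearly prices, which reproduces the denominators 65 (64.5 before rounding) and 67; the variable names are illustrative.

```python
q1, p1 = [10, 3], [4, 7]   # first year: quantities, prices of A and B
q2, p2 = [12, 2], [5, 6]   # second year

# Standard price of each commodity: the average of its two yearly prices.
p_m = [(a + b) / 2 for a, b in zip(p1, p2)]     # [4.5, 6.5]

def total(prices, quantities):
    """Sum of price times quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))

# Price index for each year: value at current prices over value at standard prices.
P1 = total(p1, q1) / total(p_m, q1)   # 61 / 64.5
P2 = total(p2, q2) / total(p_m, q2)   # 72 / 67

print(round(100 * P2 / P1))   # 114  (second year on a first-year base)
print(round(100 * P1 / P2))   # 88   (first year on a second-year base)
```

Because both years are measured against the same standard prices, the two results are reciprocals of each other, which is the consistency the earlier computation lacked.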

Fisher’s Index. Thus the standard quantity method 
may be used to avoid the inconsistency encountered 
when different standard prices are used in measuring 
quantities. But Professor Fisher has proposed an- 
other solution which he regards as the theoretical ideal. 
His method involves the finding of the geometrical 
average of the two inconsistent results, as seen in the 
two indexes, 116 and 111. This method will here give 
a result of 114, which is equal to that obtained by 
the method of standard quantities, though as a rule 
the results are only approximately equal. The latter 
method has a distinct advantage in that it will give a 
consistent series of index numbers for a required inter- 
val of time, and the series may be reduced to any de- 
sired base. The weight of authority, however, supports 
Professor Fisher’s solution as being theoretically the 
best. His formula, which is somewhat too complex for 
ordinary use, is as follows:

$$P_n = \sqrt{\frac{\sum p_n q_o}{\sum p_o q_o} \times \frac{\sum p_n q_n}{\sum p_o q_n}}$$



in which the subscripts o and n indicate a base year 
and a relative year, respectively. The development of
the formula is sufficiently explained in the foregoing
discussion. The corresponding formula for quantities
($Q_n$) may be written by interchanging each p and q
in the price formula. 
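The geometric averaging described above may be illustrated with the two-commodity data of the preceding section. This is a minimal sketch; the names `base_weighted`, `relative_weighted`, and `fisher` are assumptions chosen for the illustration.

```python
from math import sqrt

q_o, p_o = [10, 3], [4, 7]   # base year (first year of the example)
q_n, p_n = [12, 2], [5, 6]   # relative year (second year)

def total(prices, quantities):
    """Sum of price times quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))

# The two inconsistent results: one weights by base-year quantities,
# the other by relative-year quantities.
base_weighted = total(p_n, q_o) / total(p_o, q_o)       # 68 / 61
relative_weighted = total(p_n, q_n) / total(p_o, q_n)   # 72 / 62

# Geometric average of the two results.
fisher = sqrt(base_weighted * relative_weighted)

print(round(100 * base_weighted))      # 111
print(round(100 * relative_weighted))  # 116
print(round(100 * fisher))             # 114
```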

It will hardly be profitable to attempt to follow any 
further the theoretical problems connected with the 
compiling of price and quantity indexes. It must be 
emphasized that at present the more significant prac- 
tical problems relate to the securing of adequate and 
reliable data rather than to the precision of the meth- 
ods used in their elaboration. The choice of method, 
as a rule, will not involve the danger of serious error ; 
while work may be quite invalidated by deficiencies 
and inaccuracies in the data. An understanding of the 
theory of the subject will nevertheless be found to have 
a practical value in the planning of work and in the
evaluation of results. 

Quantity Theory Indexes. An interesting applica- 
tion of price and production indexes to the measure- 
ment of general business changes may be made on the 
basis of the quantity theory of money. This theory 
in its simplest form states that prices (P) change 
directly as the circulating medium (M) and its rate of 
circulation (R), and inversely as the number of phys- 
ical units traded (N). This statement is expressed 
algebraically as the following equation of exchange:

$$P = \frac{MR}{N}$$



The equation may be elaborated by subdividing the 
circulating medium into its various elements, as money 
and credit, and dealing with each on the basis of its 
own average rate of circulation. But as it is difficult to
obtain accurate data bearing upon circulation rates,
such elaboration will not be attempted here. It should 
also be noted concerning the quantity theory that there 
has been much argument as to its validity. But the 
arguments have not been concerned primarily with the 
statistical aspects of the subject, to which little objec- 
tion has been made. They have dealt, rather, with the 
origin of changes in the equilibrium expressed by the 
equation. 

In order to make a certain provisional use of the 
formula, further data must be adduced. Values for 
the term M may be obtained from a study made by 
Professor E. W. Kemmerer, entitled, High Prices 
and Deflation, which gives estimates of the money 
and bank credit in circulation during the war period. 
Brought down to 1920, these estimates are as follows:

Circulation
(Millions of dollars)

Year     Money     Deposits
1913     3,390      12,678
1914     3,505      13,430
1915     3,682      14,411
1916     4,159      17,840
1917     4,914      21,273
1918     5,579      23,771
1919     5,793      27,928
1920     6,060      30,300


In combining money and deposits into a single index 
of circulating medium, it is necessary to multiply de- 
posits by two because they circulate, in the form of 
checks, about twice as fast as money. The numbers
obtained by addition may then be reduced to a 1913
base. For the value of N, index numbers of physical 
production will here be substituted. It is assumed that 
this may be done because, as a rule, the volume of 
trade parallels the volume of production. For the 
value of P, the Bureau of Labor Statistics’ index of 
wholesale prices will be used. 
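The construction of the index of circulating medium may be sketched from the estimates quoted above. The dictionary names are illustrative assumptions; the doubling of deposits follows the rule stated in the text.

```python
# Kemmerer's estimates as quoted in the text (millions of dollars).
money = {1913: 3390, 1914: 3505, 1915: 3682, 1916: 4159,
         1917: 4914, 1918: 5579, 1919: 5793, 1920: 6060}
deposits = {1913: 12678, 1914: 13430, 1915: 14411, 1916: 17840,
            1917: 21273, 1918: 23771, 1919: 27928, 1920: 30300}

# Deposits are counted twice because they are taken to circulate about
# twice as fast as money.
medium = {year: money[year] + 2 * deposits[year] for year in money}

# Reduce the series to a 1913 base (1913 = 100).
m_index = {year: round(100 * medium[year] / medium[1913]) for year in medium}

for year, value in m_index.items():
    print(year, value)
```

The resulting series (100, 106, 113, 139, 165, 185, 214, 232) is the index of M used with the equation of exchange.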

An index of the rate of currency circulation may 
now be readily obtained algebraically by the use of the 
equation of exchange expressed in the form

$$R = \frac{PN}{M}$$


The resulting index will not, of course, register at all 
accurately the percentage changes in the rate of cur- 
rency circulation, owing to the fact that changes in the 
rate of circulation of goods have been omitted by the 
substitution of volume of production for volume of 
trade. But it should show changes in the activity of 
business by registering the increases or decreases of 
monetary circulation relative to corresponding changes 
in the circulation of goods. That is, it should indicate 
the relative activity of the consumption market. 
Speculative flurries, which increase the circulation of 
both currency and securities, will therefore not regis- 
ter at all. That the index does, in fact, show changes 
in the activity of business from year to year with some 
degree of certainty, is apparent from the figures. The 
depression of 1914, the war activity of 1917, the tem- 
porary slump of 1919, and the boom culminating in 
1920 are all indicated. 

The indexes entering into the equation of exchange 
as just discussed are given below. Taken together,
they present an epitome of the general movement of
business. Prices vary directly with purchasing power, 
MR (demand), and inversely with production, N (sup- 
ply). It should be observed that the values of R and N 
here used are mere approximations. They are prob- 
ably too high in 1914 and 1915, and too low in 1916. 


Year    (P)    (M)    (R)    (N)
1913    100    100    100    100
1914    100    106     94    100
1915    101    113     97    109
1916    124    139     98    110
1917    176    165    123    115
1918    196    185    123    116
1919    212    214    106    107
1920    243    232    118    113
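The index of the rate of circulation in the table follows from the other three columns by the equation of exchange solved for R. A minimal sketch, with list names chosen for illustration:

```python
# Index numbers from the table in the text (1913 = 100).
years = [1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920]
P = [100, 100, 101, 124, 176, 196, 212, 243]   # wholesale prices
M = [100, 106, 113, 139, 165, 185, 214, 232]   # circulating medium
N = [100, 100, 109, 110, 115, 116, 107, 113]   # physical production

# Equation of exchange solved for the rate of circulation: R = P * N / M,
# each term being an index on the 1913 base.
R = [round(p * n / m) for p, n, m in zip(P, N, M)]

for year, r in zip(years, R):
    print(year, r)
```

The computed values agree with the (R) column as printed.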


A complete statistical verification of the equation of 
exchange would necessitate ampler data and more ex- 
tended computations. The point of greatest difficulty 
would be found to be the construction of an index of 
the physical volume of trade (N). Theoretically this 
should include all wealth and services traded in a given 
period of time, measured in physical units of “dollar’s 
worths” at average prices for the whole interval over 
which the computation extends. Since MR is equal to 
the value of the same quantities at current prices, the 
equation of exchange reduces to the formula for find- 
ing price index numbers by the standard quantities 
method; thus:

$$P_n = \frac{MR}{N} = \frac{\sum p_n q_n}{\sum p_m q_n}$$

Of course in actual work a process of sampling must 
necessarily be employed, and various methods for aggregating
quantities may be resorted to. The price
index thus obtained would theoretically vary somewhat 
from the usual type of index, because the quantities in- 
dicated in the formula refer to trade rather than to 
production or consumption. Hence goods in which 
there is unusually active speculation will in effect be 
weighted heavily. 

For discussions of the quantity theory and statis- 
tical verifications of the equation of exchange the stu- 
dent may profitably consult Kemmerer, Money and 
Credit Instruments in their Relation to General Prices; 
Fisher, The Purchasing Power of Money; and articles 
published by King in the weekly service of the 
Bankers’ Statistics Corporation (1920). 

Measurement of the National Income. We have thus 
far considered production indexes principally with ref- 
erence to their use in connection with price indexes. 
But they are also of direct significance in that they 
indicate changes in the total national income. By 
national income is meant the sum of all consumption 
values in economic goods, services, and property 
usances, together with increases in stores of goods and 
extensions of property. This conception differs slightly 
from aggregate income as privately reckoned, inas- 
much as the latter includes increases in capitalized 
values, particularly of land. 

While production indexes show changes in the na- 
tional income, they serve merely as ratios unless sup- 
plemented by estimates of the total income for one or 
more years. Several such studies have been attempted, 
but the most complete and authoritative study is one 
recently made by the National Bureau of Economic
Research. This Bureau is a private organization
chartered in 1920 for the purpose of conducting statis- 
tical investigations into subjects affecting the public 
welfare. It is controlled by a board of nineteen directors
representing various points of view and interests.
Its staff consists of Wesley C. Mitchell,
Willford I. King, Frederick R. Macaulay, and Oswald W.
Knauth. This organization has recently issued a
monograph containing estimates of the national income
for each of the years 1909-1918, inclusive. The mono- 
graph is worthy of careful study, not only for its con- 
clusions, but also as an excellent example of applied 
statistical methods. The estimates of income were 
made independently on two different bases; first, by 
sources of production, and second by incomes re- 
ceived. The two sets of results agree very closely, 
the maximum difference being 6.9% in 1913. They 
were averaged to give the final estimates, which were 
reduced to terms of 1913 prices. As thus reduced, the 
indexes were as follows, the income for 1913 being 
34.4 billion dollars : 


Year     Index
1909       88
1910       94
1911       92
1912       97
1913      100
1914       96
1915      102
1916      118
1917      119
1918      113
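Since the income for 1913 is stated in dollars, each index number may be turned back into an amount at 1913 prices. The following is a sketch only; the names `INCOME_1913` and `income` are assumptions, and the dollar figures it prints are derived by simple proportion rather than quoted from the monograph.

```python
INCOME_1913 = 34.4  # billions of dollars, from the text

index = {1909: 88, 1910: 94, 1911: 92, 1912: 97, 1913: 100,
         1914: 96, 1915: 102, 1916: 118, 1917: 119, 1918: 113}

# Income in billions of 1913 dollars implied by each index number.
income = {year: INCOME_1913 * value / 100 for year, value in index.items()}

for year, billions in income.items():
    print(year, round(billions, 1))
```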


The Distribution of Income. The National Bureau 
of Economic Research has also made a study of the 
individual distribution of income in 1918, based in
large part on income tax returns. The total income
for that year was found to be about 58 billion dollars,
and the number of personal income recipients was
37.6 million, exclusive of men in active service in
respect to both items.¹ The average income was $1,543;
the mode, $957; and the three quartiles were $833,
$1,140, and $1,574, respectively. The distribution is
summarized in the following derived tables:

FREQUENCY DISTRIBUTION OF NATIONAL INCOME IN PERCENTAGES
OF CIVILIAN INCOME RECIPIENTS, U. S., 1918

Income class      Per cent
Under $400           2.84
$ 400- 800          19.51
  800-1200          32.16
 1200-1600          21.53
 1600-2000           9.88
 2000-2400           4.64
 2400-2800           2.63
 2800-3200           1.61
 3200-3600           1.07
 3600-4000            .74
 4000 & over         3.39


CUMULATIVE PERCENTAGES OF CIVILIAN INCOME RECIPIENTS
AND OF NATIONAL INCOME, U. S., 1918

Recipients      Aggregate
of incomes      incomes
    10             2.7
    20             7.2
    30            12.5
    40            18.7
    50            26.0
    60            33.6
    70            42.2
    80            52.5
    90            65.2
    99            86.2
   100           100.0

¹ [If men in active service] are included, the average income becomes $1,490.





The first of these tables may be graphed as a frequency
curve, omitting the last class of “$4,000 and
over,” which if accurately represented should be extended
to include incomes of several millions.¹ The
second may be graphed as a Lorenz curve, the first 
column being represented on the base line, and the 
vertical scale being drawn equal to the horizontal scale. 
A diagonal drawn upward from left to right — the line 
of equal distribution — will serve as a basis of compari- 
son. Or, the second table may be more simply graphed 
by obtaining by subtraction the non-cumulative per- 
centages of aggregate income, and representing them 
as successive vertical blocks above a base line on which 
are measured the successive ten per cent groups of 
income recipients. 
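The subtraction that yields the non-cumulative percentages of aggregate income may be sketched as follows. The figures are those of the cumulative table; the names `cumulative` and `shares` are illustrative assumptions. Note that the last two groups in the table are 90 to 99 and 99 to 100 per cent of recipients, not full ten per cent groups.

```python
# Cumulative percentages from the second table: (recipients, aggregate income).
cumulative = [(10, 2.7), (20, 7.2), (30, 12.5), (40, 18.7), (50, 26.0),
              (60, 33.6), (70, 42.2), (80, 52.5), (90, 65.2), (99, 86.2),
              (100, 100.0)]

# Subtract each cumulative figure from the next to get the share of income
# received by each successive group of recipients.
shares = []
previous_recipients, previous_income = 0, 0.0
for recipients, income in cumulative:
    group = f"{previous_recipients}-{recipients}%"
    shares.append((group, round(income - previous_income, 1)))
    previous_recipients, previous_income = recipients, income

for group, share in shares:
    print(group, share)
```

Under perfectly equal distribution every ten per cent group would show a share of 10; the departures from that figure are what the blocks, or the Lorenz curve, display.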

¹ If the frequency distribution of incomes is graphed, as represented,
on semi-logarithmic paper, with the number of incomes (income
recipients) plotted on the vertical arithmetic scale, and the magnitude
of the incomes plotted on the horizontal logarithmic scale, a somewhat
regular frequency curve is formed. This indicates a type of distribution
which is normally symmetrical when the ratio departures from
the geometric mean, rather than the differences from the arithmetic
mean, are taken as the basis for measuring the dispersion. In such a
distribution the median is therefore approximately the geometric mean
of the first and third quartiles, instead of the arithmetic mean.

Pareto’s Law. If the percentages in the frequency
distribution are summated in reverse order,
and are plotted vertically against the corresponding
lower class limits on double-logarithmic paper (double
cycle on each scale), the so-called Pareto’s law of
income distribution will be illustrated. This law states
that the curve of income distribution above the mode,
when logarithmically plotted, approximates a straight
line. Much the same result may also be obtained by
transferring the Lorenz curve to double-logarithmic
paper; or it may be more simply evidenced by plotting
the original frequency curve on similar paper.
Pareto’s law merely calls attention to the type of
frequency curve normally followed by income data. This
is a curve which when graphed in the ordinary manner
is seen to be strongly skewed to the right, somewhat
the same type as was suggested by the wage data
studied in the first two chapters. In formulating the
law Pareto thought he had discovered a somewhat
inflexible and permanent fact of economic relationships.
While he doubtless over-emphasized the inflexibility of
the law, yet he nevertheless pointed out an interesting
illustration of the strong tendency to statistical
regularity inherent in biological and social phenomena.
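The reverse summation described above may be sketched with the frequency table already given. The least-squares slope computed at the end is an addition made here for illustration and is not a step taken in the text; the variable names are likewise assumptions.

```python
from math import log10

# Frequency distribution from the text: (lower class limit in dollars, per cent).
# The lowest class ("Under $400") has no positive lower limit and is left out.
classes = [(400, 19.51), (800, 32.16), (1200, 21.53), (1600, 9.88),
           (2000, 4.64), (2400, 2.63), (2800, 1.61), (3200, 1.07),
           (3600, 0.74), (4000, 3.39)]

# Summate the percentages in reverse order: per cent of recipients at or
# above each lower class limit.
at_or_above = []
running = 0.0
for limit, percent in reversed(classes):
    running += percent
    at_or_above.append((limit, round(running, 2)))
at_or_above.reverse()

# Points above the mode ($957), on logarithmic scales.
points = [(log10(limit), log10(pct)) for limit, pct in at_or_above if limit >= 1200]

# Least-squares slope of the log-log points; a roughly constant negative
# slope is what the straight-line form of the law describes.
n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
         / sum((x - mean_x) ** 2 for x, _ in points))

for limit, pct in at_or_above:
    print(limit, pct)
print(round(slope, 2))
```

Plotting the printed pairs on double-logarithmic paper, as the text directs, should show how nearly the points above the mode fall on one line.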

Income in Other Countries. The National Bureau 
of Economic Research has also made estimates of the 
per capita income in several countries for the year 
1914. These estimates are based upon prior studies 
made by a well-known English authority, Sir Josiah 
Stamp. The Bureau’s data give the following ratios: 


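United States ........ 100%
Australia ............  79
United Kingdom .......  73
Canada ...............  58
France ...............  55
Germany ..............  44
Italy ................  33
Austria-Hungary ......  30
Spain ................  16
Japan ................   9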




The Distribution of Property. A problem closely 
related to the distribution of income is the distribution 
of property ownership. For the United States the best 
known investigation of the subject is one published by 
Professor W. I. King in 1915. The total wealth for 
1910 was estimated to have been about 200 billion dol- 
lars, and its ownership was distributed in a form that 
is suggested by the curve of income. The richest two 
per cent of the families were thought to own over fifty 
per cent of the property. But the data on which such 
estimates rest involve many uncertainties, and the 
meaning of the results is further obscured by lack of 
knowledge of the rate at which fortunes are acquired 
and dissipated. 

REFERENCES 

National Bureau of Economic Research, Income in the United 
States, Volume I. 

Day, Edmund E., “An Index of the Physical Volume of Pro- 
duction,” Review of Economic Statistics, September, 1920- 
January, 1921. 

Fisher, Irving, “The Best Form of Index Numbers” (and 
discussion), Quarterly Publications of the American Statis- 
tical Association, March, 1921, pp. 533-551. 

Fisher, Irving, The Purchasing Power of Money. 

Hoffman, F. L., “The Economic Progress of the United States 
During the Last Seventy-five Years,” Quarterly Publica- 
tions of the American Statistical Association, December, 
1914, pp. 294-318. For further data on the same topic, see 
Appendix III. 

Ingalls, W. R., “Labor the Holder of the Nation’s Wealth and 
Income,” The Annalist, September 13, 20, and 27, 1920.

King, W. I., Wealth and Income of the People of the United
States.

Meeker, Royal, “On the Best Form of Index Numbers,”
Quarterly Publications of the American Statistical
Association, Sept., 1921, pp. 909-915.

Stewart, Walter W., “An Index Number of Production”
(and discussion), American Economic Review, March, 1921, 
pp. 57-81. 




Walsh, C. M., The Measurement of General Exchange Value. 

Working, Holbrook, “What is to be the Future Price Level?”
The Annalist, June 27, 1921, p. 686.

EXERCISES 

1. From Tables XIV and XIV-A (pages 76 and 77) find an 
index of value production (a) for crops and (b) for min- 
erals. Reduce each index to a 1913 base. 

2. Divide each item in the index of value production of 
minerals just obtained, by the corresponding item in 
the index of physical production of minerals in the 
United States. Explain the results. 

3. Divide each item in the index of value production of 
crops by the corresponding item in the index of physical 
production of crops. Explain the results. 

4. Plot on semi-logarithmic paper the index obtained in the 
preceding exercise, together with the index of whole- 
sale prices in the United States. Explain the divergence 
of the two trends. 

5. “The Monthly Review,” issued at the Federal Reserve
Bank of New York, gives the value of the ten leading 
crops in the United States at average (standard) prices 
for recent years as follows : 


Year      Value
          (millions of dollars)
1910      5,873
1911      5,491
1912      6,549
1913      5,750
1914      6,397
1915      6,831
1916      5,907
1917      6,507
1918      6,418
1919      6,626
1920      7,284
1921      6,118


From these data derive an index of agricultural produc- 
tion (base, 1913), and compare it graphically with the 
corresponding index computed from three leading crops 
(Table XV, page 81). 





6. From the data of crops and prices given below, find index 
numbers of physical production, value production, and 
prices, for the year 1914. Express each index in terms 
of 1913 as a base. (It is assumed that the prices of 1913 
represent average prices for a certain period.) 


             Wheat             Corn             Cotton
Year      Bu.    Price      Bu.    Price     Bales    Price
1913      760    $1.05     2450    $0.72       14    $61.00
1914      890     1.10     2670      .80       16     53.00


7. From the following data find indexes of the four terms
used in the equation of exchange (quantity theory) for
the year 1910, taking the year 1900 as the base.

Year:                           1900      1910
Price index                       80       100
Money in circulation             2.2         3   (billions)
Deposits subject to check        9.4     14.25
Physical production               20        30


8. Plot together on semi-logarithmic paper the index of
wholesale prices in the United States for 1890 to 1920
inclusive, and a similar index (base, 1913) of per capita
circulation derived from the following data:

1890    $22.82        1905    $31.08
1891     23.45        1906     32.32
1892     24.60        1907     32.22
1893     24.06        1908     34.72
1894     24.56        1909     34.93
1895     23.24        1910     34.33
1896     21.44        1911     34.20
1897     22.92        1912     34.34
1898     25.19        1913     34.56
1899     25.62        1914     34.35
1900     26.93        1915     35.44
1901     27.98        1916     39.29
1902     28.43        1917     45.74
1903     29.42        1918     50.81
1904     30.77        1919     54.33
                      1920     57.04


9. The two indexes of physical production in the United 
States (1913-1920) given below, are adapted (a) from 
Day's and (b) from Stewart's comprehensive studies. 




Using data for prices and circulating medium cited in the
discussion of the quantity theory of money, derive two
indexes for B, 1913-1920, and compare them with the
corresponding index included in the data just mentioned.


Year     (A)     (B)

1913     100     100
1914      98     100
1915     105     111
1916     111     116
1917     114     123
1918     113     124
1919     106     119
1920     111     122



CHAPTER V 


TRENDS AND CYCLES 

The Nature of Trends. The interpretation of a time 
series of index numbers usually requires the determi- 
nation of the trend. A trend is a derived series of in- 
dex numbers following the general course of the given 
items, but shortening or eliminating the fluctuations. 
Its significance is best grasped by means of a graphic 
representation. An inspection of such graphic work 
will show that trends may vary from straight lines on 
the one hand, to curves almost conforming to the 
given items on the other. Whether a trend is drawn 
as a straight or an irregular line depends in part on 
the nature of the data and in part on the purpose to be 
served. 

The Free-hand Method. A good draftsman com- 
monly determines trends merely by charting his data 
and, without any preliminary computations, drawing 
a straight or curved median line through them. 1 Such
work may be done entirely free-hand, or by the use
of irregular curves and other drafting material, but in
either case it is classed as a free-hand method. For
many purposes this method will prove satisfactory,
and should be practiced by the student until it can be 
used with facility. For even though more elaborate
methods are to be applied in actual work, practice in
the free-hand drawing of trends will be helpful. It
will develop an appreciation of the requirements of a
given problem, without which the best of mathematical
methods are likely to be misapplied.

1 If a trend based upon geometric means is desired, the data may be
plotted on semi-logarithmic paper.

An example of the free-hand method of drawing a 
trend is given in the graph of the index of real wages, 



Figure 6. Trends and cycles. Upper line, index of real wages (hour 
rates) in the United States (see Table X) and its trend; middle line, 
index of per capita production (see Table XV) and its trend; lower 
line, cycles of wholesale prices (see Figure 9). 

Reprinted, with permission, from The Quarterly Journal of the
University of North Dakota, January, 1922.


in Figure 6. In this case a more precise method would 
not be applicable because of the inaccuracies caused 
by the substitution of wholesale prices for the cost of
living in the computation. As was previously stated, 
there is reason to think that the marked rise in real 
wages appearing during the decade 1890-1900 is an 
exaggeration. The trend was therefore drawn so as to
discount this rise. As a result it does not conform to
the usual rule that the deviations above and below it 
should balance. 

Method of Semi-Averages. When a straight-line 
trend is to be drawn through a fairly long series, the 
free-hand method may be improved upon by means of 
a simple calculation. The average of the series may 
be plotted on the middle ordinate, and the straight- 
line trend drawn by inspection through this point. 
The plus and minus deviations must then necessarily 
balance. Or, better still, the series may be divided 
into two equal parts and an average taken for each 
part. These averages may be plotted on the middle 
ordinate of each half series, respectively, and the trend 
drawn through the two points. If the series consists 
of an odd number of items, the middle item and unit 
of time may be divided between the two parts. This 
method of semi-averages is the one that was used in 
drawing the trend of per capita production in Figure 
6. The trend thus obtained is, in a long series, nearly 
identical with the so-called line of least squares to be 
discussed later. 
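
The method of semi-averages may be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `semi_average_trend` is hypothetical, the items are taken to be equally spaced, and for an odd number of items the middle item and its unit of time are divided equally between the two parts, as described above. The sample series is the per capita production index tabulated in the moving-average illustration of this chapter.

```python
def semi_average_trend(items):
    """Straight-line trend through the averages of the two halves of a series.

    Items are assumed equally spaced and numbered 0, 1, 2, ... in time.
    """
    n = len(items)
    if n < 2:
        raise ValueError("at least two items are needed")
    half = n // 2
    first = sum(items[:half])
    second = sum(items[n - half:])
    if n % 2 == 1:
        # Odd number of items: divide the middle item between the two parts.
        first += 0.5 * items[half]
        second += 0.5 * items[half]
    mean1 = first / (n / 2)
    mean2 = second / (n / 2)
    # Middle ordinate of each half; the whole series spans -0.5 to n - 0.5.
    x1 = (n - 2) / 4
    x2 = (3 * n - 2) / 4
    slope = (mean2 - mean1) / (x2 - x1)
    return [mean1 + slope * (i - x1) for i in range(n)]


production = [100, 95, 108, 100, 98, 106, 106, 109, 108, 99, 103]  # 1910-1920
trend = semi_average_trend(production)
```

Because the two half-averages carry equal weight, the trend so drawn passes through the general average at the middle of the series, and the plus and minus deviations balance.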

The Moving Average. Among the trends that are 
found by mathematical methods, perhaps the best 
known is the moving average. The process of com- 
puting the moving average is simple in principle, but 
it is usually tedious in practice even when calculating 
machines are used. By this method the position of 
the trend at any given period in the series is found 
by averaging a certain number of items centering at 
that period. Just how many items should be included 
in the average must be determined by the nature of
the data. If the series shows a cyclic movement of
known length, then by taking the number of items 
covering this length of time, the cycles will be 
smoothed. If monthly data having a pronounced sea- 
sonal swing are being studied through an interval of 
several years, a twelve-month moving average is ap- 
propriate. 1 The method of finding the moving aver- 
age is illustrated in the following table : 


         Per Capita       Moving Averages
Year     Production       3-Year     5-Year

1910     100
1911      95              101
1912     108              101        100
1913     100              102        101
1914      98              101        104
1915     106              103        104
1916     106              107        105
1917     109              108        106
1918     108              105        105
1919      99              103
1920     103




In explanation of the foregoing table it may be said
that the first number in the three-year moving average
(101) is obtained by averaging the first, second, and
third items (100, 95, and 108). The second number
(101) is obtained by averaging the second, third, and
fourth items (95, 108, and 100). The results are written
to the nearest unit, and are placed opposite the
middle one of the three items averaged. In the same
way the succeeding averages are derived. The five-year
moving average is similarly computed, except
that five items are averaged at a time. In practice
the work may be somewhat abridged, after the total
of the first group of items is found, by deriving the
next total from it. This may be done by adding to
the first total the difference between the next item
about to be included and the one about to be dropped.

1 The twelve-month moving average centers between the sixth and
seventh month in each computation. In order to make the trend thus
obtained fall on the same ordinates as the original items, it is
necessary to adjust it by deriving from it a two-month moving average. The
deviations of the original items may then be readily obtained.



Figure 7. Moving averages. A, Index of per capita production, 
1910-1920. B, Three-year moving average of same. C, Five-year mov- 
ing average of same. 


Thus, the first five items above give 501 (average, 
100). In obtaining the total of the second to the sixth 
items, inclusive, the sixth item (106) will be added 
and the first item (100) will be dropped; that is, a 
balance of six will be added. The new total is therefore
501 + 6, or 507 (average 101). In the same way
the succeeding totals and averages may be derived. 1
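
The abridged computation by running totals may be sketched in Python as follows. The function names are hypothetical and serve this illustration only. The second function sketches the device, taken up in the paragraphs that follow, of repeating the extreme items so that the trend is as long as the series; it is assumed here that the items are repeated in mirror order about each end, which reproduces the two worked values given in the text for 1910 and 1911.

```python
def moving_average(items, span):
    """Centered moving average for an odd span, built from a running total."""
    items = list(items)
    if span < 1 or span % 2 == 0 or span > len(items):
        raise ValueError("span must be odd and no longer than the series")
    total = sum(items[:span])                   # total of the first group
    averages = [total / span]
    for i in range(span, len(items)):
        total += items[i] - items[i - span]     # add the new item, drop the oldest
        averages.append(total / span)
    return averages        # the first average belongs opposite item span // 2


def padded_moving_average(items, span):
    """Full-length trend obtained by repeating the extreme items at each end."""
    items = list(items)
    k = span // 2
    head = items[:k][::-1]                      # mirror of the first k items
    tail = items[len(items) - k:][::-1]         # mirror of the last k items
    return moving_average(head + items + tail, span)


production = [100, 95, 108, 100, 98, 106, 106, 109, 108, 99, 103]  # 1910-1920
print([round(v) for v in moving_average(production, 3)])
# [101, 101, 102, 101, 103, 107, 108, 105, 103]
print([round(v) for v in moving_average(production, 5)])
# [100, 101, 104, 104, 105, 106, 105]
```

The printed values agree with the three-year and five-year columns of the table above.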

1 Mathematicians sometimes prefer a more complex form of the
moving average known as the progressive mean. This is similar to
the moving average as illustrated and explained, except that weights
are used in taking the average. The weights are derived by the binomial
theorem, and are the frequencies of a theoretical curve of distribution.
Thus in taking a five-year progressive mean, each group of five
terms is averaged by applying to the terms in succession the weights
1:4:6:4:1. Similarly, a seven-year progressive mean would make use
of the weights 1:6:15:20:15:6:1 (cf. Slichter, Elementary Mathematical
Analysis, p. 194).

The two moving averages just described, and the
index on which they are based, are plotted in Figure
7. The figure will serve to make clear the general rule
that the more inclusive the moving average, the
smoother the trend will be. A disadvantage of the
method will also be observed. The moving average
derived from a given series of items will always be
shorter than the series; and the more inclusive it is,
the shorter it will be. It is possible, however, to find
a tentative substitute for the lacking items in the trend
by repeating the extreme items in finding the averages.
Thus, in the foregoing five-year moving average,
a trend item for 1911 might have been obtained as follows:

$$ \frac{2 \times 100 + 95 + 108 + 100}{5} = 101 $$

and for 1910:

$$ \frac{2 \times 100 + 2 \times 95 + 108}{5} = 100 $$

This method has the mathematical advantage of making
the sum of the trend items equal to the sum of
the data, a fact which may be found convenient to
use as a basis for checking the computations. It may
also be adapted to finding a current trend item for a
series of index numbers that is kept up to date.

The Line of Least Squares. If a straight-line trend
is at all adapted to a given series, the most satisfactory
mathematical trend to use is one known as the
line of least squares. The name is derived from the
fact that the line is so drawn that the sum of the squares
of the deviations of the data from it, as measured on the
ordinates, is a minimum. The line of least



Figure 8. Straight-line trend. TT', line of least squares for the
seven points indicated; UU', line of unit slope.


squares should be thoroughly understood, not only 
because of its usefulness as a trend, but also because 
it is the basis from which the principal method of com- 
puting correlation is developed. In explaining it, the 
following simple illustration will be taken: 




Suppose that it is required to find a straight-line 
trend for the index y (see next page), the average of 
which is zero. The data are plotted upon coordinate 
paper — the x and y scales having preferably the same 
unit — as shown in Figure 8. The average of the data 
is made to fall on the x-axis, and the middle item is 
plotted on the y-axis. If we consider the vertical 
distance of each index from the x-axis to represent a 
force bearing upon that line, then the total moments 
of these forces will be expressed by the sum of the 
xy’s. This sum may be compared with the sum of the
moments of a line passing through the same ordinates, 
and forming an angle of forty-five degrees with the 
x and y axes at their point of intersection. 1 Such a 
line is said to have a unit slope; that is, it rises one 
unit (y) for each unit (x) to the right. Its slope may 
also be expressed by saying that the tangent of its 
angle (UOX) is unity. The sum of its xy’s is, of
course, identical with the sum of its x’s squared. The
slope of the line of least squares is found by compar- 
ing the moments of the data with those of the line 
of unit slope, as expressed in the formula : 


$$ S = \frac{\sum xy}{\sum x^2} $$

in which,

S = slope of the line of least squares, or tangent of
its angle

x = position of items relative to middle ordinate
y = items, as given

The data (y), their moments (xy), the moments
of the line of unit slope ($x^2$), and the computation of
the line of least squares, are shown in the following
table:

   y       x      x²      xy     Trend

  -3      -3       9       9     -4.5
  -4      -2       4       8     -3
  -2      -1       1       2     -1.5
  -1       0       0       0      0
   1       1       1       1      1.5
   5       2       4      10      3
   4       3       9      12      4.5

Sums:
   0              28      42

S = 42 ÷ 28 = 1.5

1 The angle will vary, of course, if the x and y scales differ.

The last column headed “trend” gives the line of least 
squares. This column is computed from the middle or- 
dinate (0) by adding the slope (S = 1.5) once for each 
successive ordinate in a positive direction, and sub- 
tracting it once for each successive ordinate in a nega- 
tive direction. 
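
The formula for the slope follows from the least-squares requirement itself. A brief derivation, not given in the text, is sketched here on the assumptions of the illustration: the data average zero and x is measured from the middle ordinate, so that each trend item is Sx. The symbol Q, standing for the sum of the squared deviations, is introduced for this derivation only.

```latex
Q(S) = \sum (y - Sx)^2
\qquad
\frac{dQ}{dS} = -2 \sum x\,(y - Sx) = 0
\quad\Longrightarrow\quad
S = \frac{\sum xy}{\sum x^2}

% For the seven items of the table:
S = \frac{42}{28} = 1.5
```

Since the second derivative, $2\sum x^2$, is positive, the value of S so found gives a minimum and not a maximum.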

Data having an average of zero have here been 
taken merely for the sake of simplicity; the same 
process with but little modification may be applied to 
any values. The dates or other numbers correspond- 
ing to the items cannot be used, however, but must 
be replaced by an x scale centering at the middle point
of the series. The average of the data is found, 
and the trend is computed from this average by suc- 
cessive additions of the slope in a positive direction, 
and subtractions in a negative direction. The method 
is illustrated by the use of the following data, which 
parallel those used in the preceding illustration, except 
that the average is 100. This increase in the values of
y disappears from $\sum xy$ because it affects equally both
the minus and the plus items.


Year      y       x      x²      xy      Trend

1900      97     -3       9     -291      95.5
1901      96     -2       4     -192      97
1902      98     -1       1      -98      98.5
1903      99      0       0        0     100 (average)
1904     101      1       1      101     101.5
1905     105      2       4      210     103
1906     104      3       9      312     104.5

Sums:    700             28       42

A = 700 ÷ 7 = 100
S = 42 ÷ 28 = 1.5



The following details may be noted: (a) If there are 
an even number of ordinates, the y-axis will lie mid- 
way between the two middle ordinates, which are num- 
bered as -0.5 and 0.5 respectively. The horizontal 
positive scale will therefore read 0.5, 1.5, 2.5, etc., and 
the negative scale will be the reverse. (b) It will sometimes
be found that the value of S is negative. This
indicates a downward slope of the line of least squares.
(c) The position of the line of least squares is described
by designating the period coinciding with the
y axis as the point of origin, and by expressing the
value of y algebraically in terms of the average and
the slope. Thus in the above illustration the point of
origin is 1903, and the equation of the trend is
y = 100 + 1.5x. 1
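
The whole procedure, including the centered x scale for an odd or an even number of items, may be sketched in Python as follows. The function name `line_of_least_squares` is hypothetical, and the items are assumed to be equally spaced in time.

```python
def line_of_least_squares(items):
    """Return (average, slope, trend) for a series of equally spaced items."""
    n = len(items)
    if n < 2:
        raise ValueError("at least two items are needed")
    average = sum(items) / n
    # x scale centered on the middle of the series: ..., -1, 0, 1, ... for an
    # odd number of items; ..., -0.5, 0.5, ... for an even number.
    x = [i - (n - 1) / 2 for i in range(n)]
    slope = sum(xi * yi for xi, yi in zip(x, items)) / sum(xi * xi for xi in x)
    trend = [average + slope * xi for xi in x]
    return average, slope, trend


average, slope, trend = line_of_least_squares([97, 96, 98, 99, 101, 105, 104])
print(average, slope)   # 100.0 1.5
print(trend)            # [95.5, 97.0, 98.5, 100.0, 101.5, 103.0, 104.5]
```

The average and slope returned are the two constants of the equation of the trend, here y = 100 + 1.5x with the origin at 1903.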

1 It has been suggested that the method of least squares might be
applied to the finding of a price index (Quarterly Journal of Economics,
August, 1921, page 567). The expenditure for any given commodity
may be plotted on a coordinate chart as the value of y, and the number
of units bought may be plotted as the value of x. The slant of a
line (tangent of the base angle) drawn from the intersection of the
axes to the point determined by the values of x and y, represents
the price. The average price of a number of commodities may be
taken as the slant of the line of least squares determined by all the
coordinate pairs of x and y, and having the intersection of the axes
as the point of origin (y = Sx). In such a case, of course, all the
values of xy will be positive. To give definite comparisons at different
dates, this method would require the use of “dollar’s worths” as
physical units. While the method is ingenious, it is of questionable
validity, since in effect it involves a weighting of the prices by the
square of the quantities in the process of finding the average. A
defense of the method on the basis of the use of least squares in the
theory of errors does not appear to be valid, since the theory of errors
would call for merely the arithmetic mean of the number of determinations.

The Parabola Trend. A broad treatment of the subject
of curve fitting would lead the student beyond the
range of ordinary statistical work. We shall not,
therefore, follow the subject farther, except to take up
a simple method of adjusting a parabola to an index.
The method is one which is often used by engineers,
and has also recently come into use to some extent
among statisticians.

Figure 9. Index of wholesale prices in the United States (see Table
X) and trend. Indexes for 1870-1880 converted to a gold basis.




The method of adjusting a parabola will be illus- 
trated by applying it to the Bureau of Labor Statistics’ 
wholesale price index for the years 1896 to 1915 inclu- 
sive, as charted in Figure 9. This figure includes also 
the same index for the years 1870-1895 (gold prices), 
to which a line of least squares has been fitted. But it 
may be easily seen that a similar trend would not be 
suited to the succeeding index numbers. The some- 
what regular curve of the latter portion of the index 
indicates that a parabola of the second order would 
be appropriate. 

In fitting the parabola, the year 1895 has been taken 
as the point of origin, though this year is not included 
in the results. It is estimated by inspection of the 
graph that the trend, if extended to 1895, would have 
a value of 64 at that date. Two other points deter- 
mining the trend may be similarly located, one at 
about the middle of the series and one at the end. 
These points have been taken as 88 for the year 1905 
(x = 10), and 102 for the year 1915 (x = 20). We 
have, then, these coordinate values of x and y: 


If x = 0,   y = 64
If x = 10,  y = 88
If x = 20,  y = 102


The equation of a parabola of the second order is,

$$ y = a + bx + cx^2 $$

If the coordinate values of x and y as just stated are
substituted successively in this equation, the following
results will be obtained:

$$ 64 = a $$
$$ 88 = a + 10b + 100c $$
$$ 102 = a + 20b + 400c $$

Solving for the constants gives,

a = 64
b = 2.9
c = -.05

Substituting these values in the original equation gives
the equation of the required trend,

$$ y = 64 + 2.9x - .05x^2 $$


from which the value of each item in the trend may be 
found by substituting the coordinate value of x. 
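
The solution for the three constants may be checked with a short Python sketch. The function name `parabola_through` is hypothetical. It solves the same three equations by the method of divided differences, which is a convenience of this illustration and not the procedure of the text, and it assumes only that the three x values are distinct.

```python
def parabola_through(p0, p1, p2):
    """Constants a, b, c of y = a + b*x + c*x**2 through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    d01 = (y1 - y0) / (x1 - x0)          # slope between first and second points
    d12 = (y2 - y1) / (x2 - x1)          # slope between second and third points
    c = (d12 - d01) / (x2 - x0)
    b = d01 - c * (x0 + x1)
    a = y0 - b * x0 - c * x0 * x0
    return a, b, c


a, b, c = parabola_through((0, 64), (10, 88), (20, 102))
print(round(a, 3), round(b, 3), round(c, 3))      # 64.0 2.9 -0.05

# Trend items for 1896 (x = 1) to 1915 (x = 20), origin at 1895.
trend = [a + b * x + c * x * x for x in range(1, 21)]
print(round(trend[0], 2), round(trend[-1], 2))    # 66.85 102.0
```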

Since trends are used as a basis for measuring fluc- 
tuations, the deviations of the data from the trend 
are usually computed. This is done by subtracting 
each item in the trend from the corresponding item 
in the data. If the trend has been accurately con- 
structed, the positive and negative deviations should 
be practically equal ; that is, their sum should be zero. 
Where the parabola has been used, a certain error will 
probably be found to have resulted from the fact that 
the original points determining the curve were located 
merely by inspection. An adjustment (centering) to
remove this error may be made by finding the sum of
the deviations ($\sum D$), dividing it by the number of the
items (N), adding the result to each of the trend
items, and subtracting it from each of the deviations.
This is expressed in the equation of the trend simply
by adding the correction, $K = \sum D / N$, to the value of a. In
work in correlation, however, the correction may be
more easily made by another method, as will be evident
later. When comparisons of the fluctuations in
different series are to be made, either by graphing or
by the computation of a coefficient of correlation, it is
often necessary to find the standard deviation. For
graphic representation, the deviations are reduced to
multiples of the standard deviation, which serves as a
comparable unit. Table XVI shows the derivation of

TABLE XVI

DERIVATION OF TREND AND CYCLES OF WHOLESALE PRICES,
BASED ON BUREAU OF LABOR STATISTICS INDEX,
UNITED STATES, 1896-1915

Equation of trend, y = 64 + 2.9x - .05x². Point of origin, 1895.
Equation, corrected, y = 63.825 + 2.9x - .05x²

        PRICE          TREND                 D                   CYCLES 1
YEAR    INDEX    x       y         D      CENTERED      D²        D/σ

1896      66     1     66.85     -.85      -.68        .4624      -.33
1897      67     2     69.60    -2.60     -2.42       5.8564     -1.17
1898      69     3     72.25    -3.25     -3.08       9.4864     -1.49
1899      74     4     74.80     -.80      -.62        .3844      -.30
1900      80     5     77.25     2.75      2.92       8.5264      1.41
1901      79     6     79.60     -.60      -.42        .1764      -.20
1902      85     7     81.85     3.15      3.32      11.0224      1.60
1903      85     8     84.00     1.00      1.18       1.3924       .57
1904      86     9     86.05     -.05       .12        .0144       .06
1905      85    10     88.00    -3.00     -2.82       7.9524     -1.36
1906      88    11     89.85    -1.85     -1.68       2.8224      -.81
1907      94    12     91.60     2.40      2.58       6.6564      1.25
1908      91    13     93.25    -2.25     -2.08       4.3264     -1.00
1909      97    14     94.80     2.20      2.38       5.6644      1.15
1910      99    15     96.25     2.75      2.92       8.5264      1.41
1911      95    16     97.60    -2.60     -2.42       5.8564     -1.17
1912     101    17     98.85     2.15      2.32       5.3824      1.12
1913     100    18    100.00     0          .18        .0324       .09
1914     100    19    101.05    -1.05      -.88        .7744      -.43
1915     101    20    102.00    -1.00      -.82        .6724      -.40

Plus items                      16.40     17.92                   8.66
Minus items                    -19.90    -17.92                  -8.66
Sum                             -3.50      0         85.9880      0

K = -3.50 ÷ 20 = -.175
σ² = 85.9880 ÷ 20 = 4.2994
σ = 2.07


1 Any set of deviations taken from an average or trend, and intended 
to measure cyclic movements, are commonly designated as cycles. They 
need not necessarily be reduced to units of the standard deviation. In 
some cases no complete cyclic movement may be discovered, but the
same designation may be used. 

















the trend just discussed, together with the correction, 
and the reduction of the deviations to multiples of the 
standard deviation. Figures 10 and 11 show price 
cycles as thus measured compared with other cycles 
similarly obtained. 
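
The centering of the deviations and their reduction to multiples of the standard deviation may be sketched in Python as follows, using the price index and the uncorrected trend of Table XVI. The function name `centered_cycles` is hypothetical. The sketch carries full precision throughout, whereas the table rounds the centered deviations to two places before squaring, so the later decimals of individual items may differ slightly from those printed.

```python
import math


def centered_cycles(data, trend):
    """Return (correction K, standard deviation, cycles in units of sigma)."""
    n = len(data)
    deviations = [d - t for d, t in zip(data, trend)]
    correction = sum(deviations) / n                  # K = sum of D / N
    centered = [d - correction for d in deviations]   # deviations from corrected trend
    sigma = math.sqrt(sum(d * d for d in centered) / n)
    cycles = [d / sigma for d in centered]
    return correction, sigma, cycles


prices = [66, 67, 69, 74, 80, 79, 85, 85, 86, 85,
          88, 94, 91, 97, 99, 95, 101, 100, 100, 101]      # 1896-1915
trend = [64 + 2.9 * x - 0.05 * x * x for x in range(1, 21)]  # origin, 1895
k, sigma, cycles = centered_cycles(prices, trend)
print(round(k, 3), round(sigma, 2))   # -0.175 2.07
```

Adding K to the constant 64 gives 63.825, the constant of the corrected equation shown at the head of the table.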

The parabola may be used where compound curves
are required by adding the term $dx^3$, and perhaps $ex^4$,
to the equation. For each term thus added, an additional
point may be located by inspection, and the
trend may thus be more exactly fitted. But the work
of solving and applying such equations becomes very
laborious. With practice, however, the student will
find ways of abbreviating the process and modifying
it to suit his purposes. Often the terms of the equation
may be estimated by simple experimentation. The
position of the point of origin may be varied to suit
given requirements. A compound curve shaped somewhat
like an italic f may be obtained by using only odd
numbered powers of x in the equation, and taking the
point of origin near the middle of the original series.
With a little ingenuity, sine curves and other trends
may be experimentally fitted. 1

Figure 10. Cycles of wholesale prices (solid line) and commitments
to New York State prisons (broken line).

Reprinted, with permission, from The Quarterly Journal of the
University of North Dakota, January, 1922.





Figure 11. Cycles of wholesale prices (solid line) and marriage 
rate (broken line) in the United States. 

Reprinted, with permission, from The Quarterly Journal of the Uni- 
versity of North Dakota, January, 1922. 

1 When it appears that the trend of an index series increases or
decreases by approximately a fixed ratio, an exponential curve may
be readily fitted as follows. Find the logarithms of the data, plot
them, and construct a straight-line trend by means of the semi-averages,
or by a line of least squares. Read the items of the trend from
the chart, consider them to be logarithms, and find the corresponding
numbers. These numbers will be the items of the required trend. In
a long series the work may be abbreviated by grouping the original
items, as by decades, and fitting the curve to the averages of these
groups.

Analyzing Business Barometers. A somewhat intricate
problem in trends is met when monthly data
are employed as indexes, or barometers, of business
conditions. Since such barometers are very generally
consulted as guides to business activities, their
interpretation is a matter of great practical importance.
The difficulty involved in their use lies in the com- 
plexity of the influences playing upon them. For con- 
venience of analysis these influences have been clas- 
sified as (1) a seasonal variation usually due to the 
dependence of industry upon weather conditions, illus- 
trated by the rising of the interest rate with the move- 
ment of crops, (2) a cyclic movement covering an in- 
terval of several years and marked by alternating in- 
dustrial depression and activity, and (3) a secular 
trend or gradual change due in most cases to the 
growth movement, as seen in the increase in produc- 
tion. In addition to influences which may be appro- 
priately classed under one of these three headings, 
there are others that must be looked upon as more or 
less accidental interruptions, of which no exact account 
can be taken. 

Measuring Seasonal Variations. The most sat- 
isfactory method of analyzing monthly business ba- 
rometers is first to compute an index of seasonal va- 
riations, and then to subtract it, month by month, 
from an index of the data based upon the secular 
trend. The result is an index of the cycles. As has 
already been noted, the twelve-month moving average 
is sometimes assumed to measure the data as distinct 
from the seasonal variations. But such a method of 
elimination takes into account the fluctuations of only 
one year at a time. Other methods have therefore 
been resorted to with the purpose of measuring the 
seasonal swing more exactly on the basis of several 
years. A simple method of this sort, which may be 
considered valid for a period in which the cyclic influences
are moderate and well distributed along an
approximately straight-line trend, is illustrated in
Table XVII.

TABLE XVII

AVERAGE MONTHLY INTEREST RATE, SHORT-TIME LOANS, U. S., 1909-1913,
AND COMPUTATION OF SEASONAL VARIATIONS

[The monthly rates, month averages, trend, and seasonal index of this
table are illegible in the source. Only the following part of the slope
computation survives.]

Year     Average     x     x²     xy

1911     4.07        0     0      0
1912     4.82        1     1      4.82
1913     5.66        2     4      11.32

S = 2.28 ÷ 10 = .228 (annual)
S ÷ 12 = .019 (monthly)

The Method of Averages. The problem stated and
solved in the table is the finding of the seasonal variations
in the interest rate on the somewhat slender basis
of the five years 1909 to 1913, inclusive. The monthly
rates as given are first averaged both by columns and
rows in order to obtain the annual averages and the
month averages. The process may therefore be called
the method of averages.

Figure 12. Average interest rate on commercial paper in the United
States (lower line) compared with index of wholesale prices (upper
line).

Adapted from Monthly Review, New York Federal Reserve Bank.

If the interest rate had maintained approximately 
the same level from year to year during the interval
studied, nothing more would be needed than to reduce
the month averages, or types, to a percentage of their 
own annual average. The results would form an index 
of the seasonal variations. But if there is a secular 
movement, it must be canceled from such an index. 
If an interval of half or three-quarters of a century 
is taken into account, a general downward trend of the 
interest rate may be discovered, as may be seen in 
Figure 12. But the five years here studied happen to 
be an exception. By means of the line of least squares, 
a positive slope of 0.019 monthly is revealed. By ap- 
plying this slope to the average of the month types, 
an annual trend is constructed, having its point of 
origin midway between the June and July items. It 
will be noted that one-half the slope must be added to 
the average to obtain the July trend item, and the same 
amount subtracted to obtain the June item. In ob- 
taining the remaining trend items, the slope is applied 
as previously explained. 

Confusion may perhaps here arise from the fact 
that the slope was computed from the five annual aver- 
ages, but was applied to the construction of a trend 
with which to compare the monthly data. But it should 
be obvious that in this case a slope obtained from the 
month averages would be materially affected by the 
seasonal swing. This we wish to retain, while the secu- 
lar trend we wish to cancel. Of course we could find 
the secular trend from the monthly data taken consecutively
as sixty items, but such a method would be
unnecessarily laborious. Hence we find it from the five
annual averages, and apply it to the construction of
a line that will serve to cancel the secular trend from
the month types. This cancellation is accomplished by
dividing the month averages, item by item, by the 
trend. The quotients should be centered, if necessary, 
by reducing them to percentages of their common aver- 
age. The result is an index of seasonal variations, 
from which the percentage deviations, month by month, 
may be directly stated. 1 
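
The method of averages may be sketched in Python as follows. The function name `seasonal_index` and the sample figures are hypothetical; the sample is not the interest-rate series of Table XVII but a made-up series rising steadily with no seasonal swing, for which every month should come out at 100. The data are assumed to run in complete calendar years, at least two in number.

```python
def seasonal_index(rates_by_year):
    """Seasonal index (per cent) from a list of years of twelve monthly items."""
    n_years = len(rates_by_year)
    if n_years < 2 or any(len(year) != 12 for year in rates_by_year):
        raise ValueError("need two or more complete years of monthly data")
    annual = [sum(year) / 12 for year in rates_by_year]
    month_types = [sum(year[m] for year in rates_by_year) / n_years
                   for m in range(12)]
    # Slope of the annual averages by the line of least squares.
    x = [i - (n_years - 1) / 2 for i in range(n_years)]
    annual_slope = (sum(xi * a for xi, a in zip(x, annual))
                    / sum(xi * xi for xi in x))
    monthly_slope = annual_slope / 12
    # Trend for the typical year, with origin midway between June and July.
    average = sum(month_types) / 12
    trend = [average + monthly_slope * (m - 5.5) for m in range(12)]
    # Cancel the secular trend, then center on the common average.
    ratios = [t / tr for t, tr in zip(month_types, trend)]
    mean_ratio = sum(ratios) / 12
    return [100 * r / mean_ratio for r in ratios]


# Hypothetical series: steady rise of .02 a month, no seasonal movement.
steady = [[4.0 + 0.02 * (12 * y + m) for m in range(12)] for y in range(5)]
index = seasonal_index(steady)   # every item is 100, apart from rounding
```

The slope is taken from the annual averages and not from the month averages, for the reason given in the text: the month averages carry the seasonal swing which it is desired to retain.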

Applying the Seasonal Index. The method of 
applying the seasonal index has already been sug- 
gested, and may be described as follows. A secular 
trend for the whole period under consideration is con- 
structed — in this case the line of least squares already 
found may be extended — and the data are reduced 
month by month to percentages of the trend. From 
each month’s item as thus found the seasonal index 
for the same month is subtracted. The remainders 
are assumed to measure the cycles, and may be plotted 
as deviations above and below a horizontal axis. The 
seasonal index may, with caution, be applied to other 
years than those from which it is derived; of course, 
the greater the number of normal years from which 
it is derived, the safer such an extension of its use 
to comparable years becomes. 
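
The application of the seasonal index may be sketched in the same way. The function name `cycles_from_seasonal` and the sample figures are hypothetical. It is assumed that the data and the trend are consecutive monthly items beginning with a January, and that the seasonal index is expressed in per cent.

```python
def cycles_from_seasonal(data, trend, seasonal):
    """Percentage of trend, less the seasonal index for the same month."""
    if len(seasonal) != 12 or len(data) != len(trend):
        raise ValueError("need twelve index numbers and matching data and trend")
    return [100 * d / t - seasonal[i % 12]
            for i, (d, t) in enumerate(zip(data, trend))]


# Hypothetical year: a level trend of 5 and data that follow the seasonal
# index exactly, so that no cyclic deviation remains.
seasonal = [96, 97, 98, 98, 97, 98, 100, 102, 105, 106, 103, 100]
trend = [5.0] * 12
data = [5.0 * s / 100 for s in seasonal]
cycles = cycles_from_seasonal(data, trend, seasonal)
```

The remainders may then be plotted as deviations above and below a horizontal axis.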

1 Another variation of the method of averages, one that is perhaps
theoretically preferable but which involves more extended calculations,
may be briefly described as follows: A twelve-month
moving average of the monthly data through a given series of years
is first found. This is adjusted to make it conform to the ordinates
of the original series by deriving from it a two-month moving average.
There is then obtained, for each month, the ratio of the original monthly
item to the corresponding adjusted moving average. The [...] of
the ratios so obtained for the Januaries is taken as the index of seasonal
variation for January; and index numbers for the other months
are similarly obtained. The twelve results are then centered, if necessary,
by reducing them to percentages of their common average (cf.
Jordan, Business Forecasting, p. 212).

The Link-relative Method. A complex but more accurate
method for measuring seasonal variations has
been developed by Professor Persons, and applied to
the analyses appearing in the early numbers of the
Review of Economic Statistics. In a somewhat simplified
form, this method is illustrated in Table
XVII-A, which is based on the data of Table XVII.
Briefly stated, the method consists in finding what are
called “link-relatives”; that is, the percentage which
the index of each month is of the preceding month.
The median link-relatives are then selected from each
month's series, and are tabulated as the month types. 1
Beginning with December as a base (100%), the types
are multiplied consecutively, producing an index series
from January to December for a typical year. If the
final December item fails to come out to 100%, in conformity
with the base in the preceding December, a
secular trend is evidently disturbing the index. The
discrepancy, if moderate, may be removed by distributing
it throughout the year; that is, by subtracting one-twelfth
of it from the January index, two-twelfths
from the February index, and so on through the year.*
The results are designated an adjusted index. The adjusted
items are next centered by reducing them to
a base of the average of the series. In making the
computations the figures were carried to one more
place than is shown in the table.

The superiority of this method lies in the fact that
in taking the link-relatives, and in selecting their
medians as the month types, the effects of the cyclic

1 To get the best results, the median should be based on a larger
number of years than are here taken.

* A more exact method is to apportion the discrepancy geometrically; 
that is, to divide the January index by the twelfth root of the final 
December index (written as a decimal), the February index by the 
square of this root, and so on. 





movements and of chance influences are minimized. 
The secular trend is satisfactorily eliminated by the 
process of adjusting. The index thus obtained should 
be accurate for the years just preceding the establish- 
ment of the Federal Reserve System. Since that time 
seasonal changes have lessened. 1 
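
The steps of the link-relative method, as outlined above, can be sketched in Python. This is a minimal sketch under stated assumptions, not a reproduction of Professor Persons's full procedure: the function name `link_relative_index` and the sample figures are hypothetical, and the December discrepancy is distributed arithmetically (by twelfths) rather than geometrically.

```python
from statistics import median


def link_relative_index(years, prior_december):
    """Seasonal index by the simplified link-relative method.

    years          -- list of years, each a list of twelve monthly items
    prior_december -- the item for the December preceding the first year
    """
    # Link-relatives: each month as a percentage of the preceding month.
    flat = [prior_december] + [item for year in years for item in year]
    links = [100.0 * flat[i] / flat[i - 1] for i in range(1, len(flat))]

    # The median link-relative of each calendar month is its "month type".
    types = [median(links[m::12]) for m in range(12)]

    # Multiply the types consecutively from a December base of 100.
    chain, level = [], 100.0
    for month_type in types:
        level = level * month_type / 100.0
        chain.append(level)

    # Distribute the December discrepancy: one-twelfth from January,
    # two-twelfths from February, and so on through the year.
    discrepancy = chain[-1] - 100.0
    adjusted = [c - discrepancy * (m + 1) / 12.0 for m, c in enumerate(chain)]

    # Center the adjusted items on the average of the series.
    average = sum(adjusted) / 12.0
    return [100.0 * a / average for a in adjusted]


# Hypothetical trendless series: the same pattern (average 100) in each year.
pattern = [96.0, 94.0, 98.0, 100.0, 102.0, 104.0,
           103.0, 101.0, 105.0, 108.0, 99.0, 90.0]
index = link_relative_index([pattern, pattern, pattern], prior_december=pattern[-1])
print([round(v, 1) for v in index])
```

Because the hypothetical series has no trend and no cycle, the final December item returns to 100 and no adjustment is needed; with actual data the discrepancy step would come into play.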

Business Cycles. The Review of Economic Statis- 
tics, in the analyses just referred to, has made exhaus- 
tive studies of the cyclic movements of the commonly 
used business barometers. After measuring the in- 
fluence of seasonal variations by a process similar to 
that just described, it determined the cycles on the 
basis of lines of least squares. It then combined 
twelve of the principal barometric indexes into three 
composite indexes which are taken as measurements of 
speculative activity, business activity, and banking 
strain, respectively. A chart of these three composite 
indexes for the years 1903 to 1913, inclusive, is here 
reproduced (Figure 13). The chart shows very clearly 
the general stages of the business cycle, from the pre- 
dominance of the speculative activity which marks the 
awakening from a period of depression, through the 
period of intensified production, and into the period 
of banking strain which heralds another depression. 
The various barometric series used in constructing the 
figure are indicated in the explanation accompanying 
the title. It should be added that while New York bank 
loans point to the speculative aspects of the cycle, those 
outside New York conform more closely to business 

1 For the most exhaustive study of seasonal variations in the interest
rate, the student should consult “Seasonal Variations in the Relative
Demand for Money and Capital in the United States” (National
Monetary Commission, 1910), by E. W. Kemmerer. See also the
Monthly Review, Federal Reserve Bank of New York, Feb. 1, 1922.





activity. For practical purposes measurements of the 
general aspects of the cycle need to be supplemented 
by indexes showing the position of particular indus- 
tries, as indicated by relative production and stocks of 
goods on hand. This need is now being met in part by 
the Federal Reserve Bank of New York, and by other 
agencies. 

Several phenomena more psychological than eco- 
nomic in nature show a tendency to fluctuate more or 



Figure 13. The index of general business conditions, 1903-14. 

A, Speculation: New York Bank clearings, average price of indus- 
trial stocks, average price of railroad stocks, and average price of 
railroad bonds. 

B, Business: Bank clearings of the United States outside New York 
City, Bradstreet’s index of wholesale commodity prices, United States 
Bureau of Labor Statistics’ index of wholesale commodity prices, and 
pig-iron production. 

C, Banking: Interest rates on 60-90 day and on 4-6 months com- 
mercial paper in New York City, loans and deposits of New York City 
clearing house banks (both inverted). 

Reproduced from Review of Economic Statistics, by permission of 
the editors. 


less closely with the business cycle. Professor A. H. 
Hansen has recently shown statistically that since 
1898 strikes increased with prosperity, though before 
that date, when the trend of prices was downward, 
they increased with depression (cf. American Eco- 
nomic Review, Dec., 1921, pp. 617-621). Mr. Roger 
Babson has traced a connection between church growth 
and the business cycle; religious activity being ap- 
parently intensified during a period of depression. 





Unemployment, failures, suicides, and crime generally, 
also increase during a period of depression. On the 
other hand, immigration, the marriage rate, and ex- 
travagance, tend to increase with a period of prosper- 
ity (cf. Figures 10 and 11, pages 114 and 115). 

As an example of the application of statistical meth- 
ods to the practical problem of forecasting the cyclic 
movements of business, the “Annalist Barometer and 
Business Index Line” may be cited. This index is 
published in graphic form, together with a brief ex- 
planation, each week in the Annalist, a well-known 
financial paper of New York. It is derived from the 
data elaborated by the Review of Economic Statis- 
tics, already briefly described. The index is the recip- 
rocal of a weighted average of the deviations from 
normal of commodity prices, interest rates, pig iron 
production, New York bank clearings, and bank clear- 
ings outside of New York. Since these series of data, 
when directly combined, measure the later phases of 
the business cycle, their decline precedes and fore- 
casts the rise in stocks marking the beginning of the 
next cycle ; and their rise similarly forecasts a decline 
in stocks. By taking the reciprocal of the deviations 
of the combined series, the forecast is made direct 
instead of inverse. By a comparison of the forecast- 
ing index and the movement of stocks in former years, 
the decisiveness of change in the former necessary to 
constitute a forecast of the latter has been determined. 
A detailed account of the construction and use of the
index will be found in the Annalist of March 28 and
of October 24, 1921.

The student who desires to inquire more intensively 





into the statistics of the business cycle should consult 
for himself the data published in the Review of Eco- 
nomic Statistics already mentioned. In addition he 
should become familiar with Wesley C. Mitchell’s 
standard work on Business Cycles, and with a 
more recent work by D. F. Jordan on Business Fore- 
casting. In connection with the underlying causes 
of the cycle, reference should be made to the interest- 
ing but very technical works of H. L. Moore (cf. Eco- 
nomic Cycles, and articles in the Quarterly Journal 
of Economics, February, August, and November, 
1921). Professor Moore discovers some relation to 
exist between a weather cycle of heavier and lighter 
rainfall and the business cycle, the average duration of 
each cycle being about eight years. The moist years 
bring as a rule larger crops, with some tendency to a 
lowering of the general price level, followed during 
the drier years by a rise in the price level. This rela- 
tion seems to be clearer for English prices than for 
American. The weather cycle is shown to be synchro- 
nous in several countries, and to be correlated with a 
cycle of barometric pressure, which in turn may have 
astronomical causes. But while the subject is very 
interesting and valuable theoretically, the correlations 
disclosed are too irregular to be of great practical 
value. 


REFERENCES 

Babson, Roger W., Business Barometers.

Davies, G. R., “Social Aspects of the Business Cycle,” Quarterly
Journal of the University of North Dakota, January,
1922.

Hurlin, Ralph G., “The Long-Time Trend of Prices in the
United States,” The Annalist, July 4, 1921.

Jordan, D. F., Business Forecasting.

Kemmerer, E. W., High Prices and Deflation.

Mitchell, Wesley C., Business Cycles.

Moore, Henry L., Economic Cycles: Their Law and Cause.

Peddle, John B., The Construction of Graphical Charts,
Chapter VI.

Persons, W. M., “Construction of a Business Barometer
Based upon Annual Data,” American Economic Review,
December, 1916, pp. 739-769.

Piatt, Andrew A., National Monetary Commission.

Tingley, Richard H., “Another Yardstick of Banking Conditions,”
The Annalist, November 28, 1921, p. 511.


EXERCISES 

1. Plot the data for production and price of wheat, 1870- 
1920 (Tables XIV and XIV-A, pp. 76 and 77), and 
draw a free-hand trend for each series. 

2. As in Exercise 1, construct free-hand trends for the pro- 
duction and price of corn. 

3. Plot the index of physical production of crops, 1870- 
1920 (Table XV, p. 81), and draw a straight-line trend 
by inspection. 

4. Apply the method of semi-averages to the construction 
of a straight-line trend for the data of the preceding 
exercise. 

5. Compute a five-year moving average of per capita pro- 
duction, 1890-1918. Plot both the trend and the data 
from which it is derived on 17"x22" cross-section paper, 
and measure graphically the deviations from the trend. 

6. Plot on a horizontal axis the deviations obtained in 
the preceding exercise. Find the average deviation, and 
indicate this on the graph for both the plus and the minus 
deviations. 

7. Compute and plot a straight-line trend (line of least 
squares) for the following price index. 


Year    Prices (3 articles)
1897          85
1898          70
1899          90
1900         130
1901         125





8. Find straight-line trends (lines of least squares) for 
the production and price of pig iron as given below, 
taking each year separately. 


PRODUCTION AND PRICE OF PIG IRON (Iron Age)
(000 omitted from production)

            1909            1910            1911            1912            1913
Month    Tons      $     Tons      $     Tons      $     Tons      $     Tons      $
Jan.    1,797  16.25    2,608  17.25    1,759  14.25    2,057  13.25    2,795  16.95
Feb.    1,707  16.13    2,397  17.06    1,794  14.25    2,100  13.31    2,586  16.69
March   1,832  15.05    2,617  16.30    2,188  14.25    2,405  13.50    2,763  16.31
April   1,738  14.25    2,483  15.37    2,065  14.25    2,375  13.75    2,752  15.65
May     1,883  14.50    2,390  15.00    1,893  13.95    2,512  14.15    2,822  14.94
June    1,930  14.70    2,265  14.85    1,787  13.44    2,440  14.25    2,628  14.06
July    2,103  15.75    2,148  14.75    1,793  13.25    2,410  14.70    2,560  13.75
Aug.    2,248  16.38    2,106  14.31    1,926  13.45    2,512  15.06    2,543  14.06
Sept.   2,385  17.35    2,056  14.25    1,977  13.31    2,463  15.87    2,505  14.25
Oct.    2,599  17.88    2,093  14.25    2,102  13.25    2,689  16.80    2,546  14.35
Nov.    2,547  17.75    1,909  14.25    1,999  13.20    2,630  17.25    2,233  13.87
Dec.    2,635  17.45    1,777  14.25    2,043  13.19    2,782  17.25    1,983  13.95
Total  25,410  16.12   26,855  15.16   23,329  13.67   29,383  14.93   30,722  14.90

(Reprinted, with permission, from Babson’s Desk Sheet.)


9. Find the average index of per capita physical pro- 
duction in the United States (page 81) for each dec- 
ade from 1870 to 1919. Using the resulting five aver- 
ages, construct a line of least squares. Plot the original 
data and the trend thus found. 

10. Find the average index of production of iron and copper 
by five year periods from 1870 to 1914. Plot these aver- 
ages, and fit to them a parabola of the second degree. 

11. Compute and plot the cycles of the interest rate, 1909- 
1913, using the data and index of seasonal variations 
presented in Table XVII, page 117. 

12. Using the data given below, find an index of seasonal
variations in exports of merchandise (a) by the method
of averages, and (b) by the link-relative method.












EXPORTS OF MERCHANDISE, UNITED STATES, 1909-1913
(In Millions of Dollars)

             1909   1910   1911   1912   1913
January       157    144    197    202    227
February      126    125    176    199    194
March         139    144    162    205    187
April         125    133    158    179    200
May           123    131    153    175    195
June          117    128    142    138    163
July          109    115    128    149    161
August        110    135    144    168    188
September     154    169    196    200    218
October       201    208    210    255    272
November      194    207    202    278    246
December      172    229    225    250    233

(December, 1908, 170)


13. The following table, adapted from the Yearbook of the 
Department of Agriculture, 1918, gives the farm price 
of wheat in the United States (cents per bushel) on the 
first day of each month for the years 1909 to 1913 
inclusive. Using the method of link relatives, derive an 
index of seasonal variations. 



                1909    1910    1911    1912    1913
January 1       93.5   103.4    88.6    88.0    76.2
February 1      95.2   105.0    89.8    90.4    79.9
March 1        103.9   105.1    85.4    90.7    80.6
April 1        107.0   104.5    83.8    92.5    79.1
May 1          115.9    99.9    84.6    99.7    80.9
June 1         123.5    97.6    86.3   102.8    82.7
July 1         120.8    95.3    84.3    99.0    81.4
August 1       107.1    98.9    82.7    89.7    77.1
September 1     95.2    95.8    84.8    85.8    77.1
October 1       94.6    93.7    88.4    83.4    77.9
November 1      99.9    90.5    91.5    83.8    77.0
December 1      98.6    88.3    87.4    76.0    79.9

(December 1, 1908, 92.2)


14. Using the method of averages, derive an index of sea- 
sonal variations from the following table of farm prices 
of wheat in the United States (cents per bushel) for the 
years 1909 to 1918, inclusive. 





Yearly averages        Monthly averages

1909    101.3          Jan. 1     109.4
1910     96.5          Feb. 1     115.2
1911     86.9          March 1    115.2
1912     87.4          April 1    116.4
1913     78.4          May 1      125.6
1914     88.4          June 1     126.0
1915    105.2          July 1     117.7
1916    125.9          Aug. 1     117.9
1917    200.8          Sept. 1    117.4
1918    204.3          Oct. 1     116.5
                       Nov. 1     119.7
                       Dec. 1     118.6


15. Compute and plot the cycles in the price of wheat, 1909- 
1913, as measured from a line of least squares. Use the 
data of Exercise 13. 

16. From financial journals and other sources obtain monthly 
or weekly quotations for recent and current dates illus- 
trating barometric subjects such as are mentioned on 
pages 66 and 124. Plot the data and construct trends. 
On the basis of these barometers and such other informa- 
tion as may be available, make a forecast of business 
conditions for the immediate future, allowing for seasonal 
variations. 



CHAPTER VI 


CORRELATION 

Correlation Defined. A study of the cycles of business
barometers leads to the problem of classifying
and measuring the relationships among them. These
relationships may be discovered in various forms and
degrees. For example, the cycles of building permits
and pig iron production will be found to move somewhat
closely together. Then again, stock prices and
commodity prices form similar waves, though the latter
usually follow a few months behind. On the other
hand, commodity prices and business failures show
opposite movements: when one is up the other is down.
All such relationships between two sets of data are
known as correlations. When the two sets of cycles
agree, the correlation is called positive; when they disagree,
it is negative. When the cycles are not quite
coincident in point of time, the one which follows is
said to show a “lag” of a given interval. 1 Correlation
carries the idea of a fundamental relationship:
either one phenomenon acts or reacts upon the other,
or both are due to common causes. The principle is
not limited to time series. Comparison might be made,
for example, between the advertising and the rate of
earnings of given business firms. But in any case
the principle would be the same as before, and the
methods used would be practically identical.

1 The term “lag” is also sometimes used to designate a smaller
degree of variation occurring in one series than in another comparable
to it. Thus the lag of retail prices behind wholesale prices is largely
a matter of degree, and only slightly a matter of time. But it is the
time element only that enters into the calculation of correlation. In
allowing for the lag, the series coming later in time is considered as
moved back by the length of the lag, and the corresponding items are
then compared. When the length of the lag is difficult to determine,
estimates must be made, and the correlation computed on the basis of
each estimate. The lag resulting in the most marked correlation is
assumed to be the correct one. In determining the lag it is often
necessary to take into account the causal relation existing between the
two series under consideration, as in a case where the assumption of a
lag for one series results in a positive correlation, while the transfer
of the lag to the other series results in a negative correlation.

The Graphic Method. A fairly good study of cor- 
relation in economic phenomena can often be made 
without any more elaborate methods than those al- 
ready described in isolating and plotting the cycles. 
Two time series, reduced to standard deviation cycles 
and plotted on equal horizontal scales may be very well 
compared by superimposing one on the other. To
facilitate comparison, one may be drawn on a transparent
medium, such as tracing cloth; or a mimeoscope
may be used. By shifting the superimposed cycles 
back and forth, the lag may be fairly accurately deter- 
mined. The correlation may be described as posi- 
tive or negative, high, moderate or low, and the lag 
and its consistency may be stated. This is the method 
adopted by the Review of Economic Statistics in its 
study of the correlations existing among 24 business 
barometric series for the years 1903-1914. 
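
The shifting of one series against the other has a simple numerical counterpart: the later series is moved back by each trial lag in turn, and the lag giving the most marked correlation is retained. The Python sketch below is illustrative only; the names `pearson` and `best_lag` and the sample figures are hypothetical, and both series are assumed to be cycles already measured from their trends.

```python
from math import sqrt


def pearson(x, y):
    """Coefficient of correlation of two equal-length series,
    deviations being taken from the average of each."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def best_lag(leader, follower, max_lag):
    """Move the follower back by 0, 1, ..., max_lag intervals and
    return (lag, r) for the lag giving the most marked correlation."""
    trials = []
    for lag in range(max_lag + 1):
        x = leader[:len(leader) - lag]   # drop items with no later partner
        y = follower[lag:]               # follower moved back by the lag
        trials.append((lag, pearson(x, y)))
    return max(trials, key=lambda trial: abs(trial[1]))


# Hypothetical series: the follower repeats the leader two intervals later.
s = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7]
leader, follower = s[2:], s[:12]
lag, r = best_lag(leader, follower, max_lag=4)
print(lag, round(r, 2))
```

Taking the largest absolute value of r treats a marked negative correlation as no less significant than a positive one; whether the lag so found is reasonable must still be judged from the causal relation between the series.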

Method of Concurrent Deviations. It is often desir- 
able, however, to measure the degree of correlation in 
precise mathematical terms. This is particularly true 
when correlated data are being used in support of a 
given theory. In order to obtain a precise result, 
mathematical methods developed originally for use in 
biometrics have been borrowed and adapted. 





As an introduction to the mathematical methods of 
measuring correlation, we may take up a simple for- 
mula which is well adapted to the comparison of short- 
time fluctuations. The formula is the expression of 
what is called the method of concurrent deviations. It 
may be illustrated by applying it to a comparison be- 
tween the short-time fluctuations of real wages and per 
capita production in the United States (cf. pp. 51, 53, 
and 81). The fluctuations may be most readily deter- 
mined by reference to a graph of each series (cf. Fig. 
6, p. 101). If at any given year the line makes an inverted
angle, like a caret (^), the fluctuation is registered
on the index as positive (+). If the angle
is V-shaped, it is registered as negative (-). If no
angle is formed, the year is indicated as neutral (0). 
In some cases it may not be possible to determine from 
the graph whether the angle is neutral, or slightly posi- 
tive or negative; in which case resort may be had to 
the data. 

After the deviations of both series have all been 
registered, they are compared across, item by item. 
If in a given year both indexes show a positive fluctua- 
tion, one agreement is counted. If positive and nega- 
tive meet, one disagreement is counted. If one or both 
of the fluctuations of a given year are neutral, one- 
half is added to both the agreements and disagree- 
ments. When this summation is complete, the larger 
of the two totals thus obtained is designated as the 
number of concurrent deviations, denoted in the for- 
mula by the letter C. The sign of the coefficient to be 
obtained by the use of the formula is determined by 
the nature of the concurrences. If they are agree- 





ments, the sign is positive ; if disagreements, the sign 
is negative. The formula for correlation (R) as thus
measured is:

\[ R = \pm\sqrt{\frac{2C - N}{N}} \]

in which N is the number of comparisons.

In the case of the wage and production indexes just
mentioned, the number of disagreements, or concurrences,
totals to 33½ and the number of comparisons is
49. The formula therefore becomes:

\[ R = -\sqrt{\frac{2(33.5) - 49}{49}} = -.61 \]


The derivation of the formula is of little importance, 
as it is patterned empirically on the one next to be 
described. The significance of the coefficient will be- 
come evident in the same connection. 
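
The counting rule of the method of concurrent deviations can be sketched in Python as follows. The sketch assumes the usual form of the coefficient, R = ±√((2C − N)/N); the function name and the sample signs are hypothetical, arranged only so that the totals match those quoted in the text (33½ disagreements in 49 comparisons).

```python
from math import sqrt


def concurrent_deviation_coefficient(signs_a, signs_b):
    """Coefficient of concurrent deviations.

    signs_a, signs_b -- fluctuations registered year by year as
                        +1 (positive), -1 (negative), or 0 (neutral)
    """
    agreements = disagreements = 0.0
    for a, b in zip(signs_a, signs_b):
        if a == 0 or b == 0:
            # A neutral year adds one-half to both totals.
            agreements += 0.5
            disagreements += 0.5
        elif a == b:
            agreements += 1.0
        else:
            disagreements += 1.0
    n = agreements + disagreements
    c = max(agreements, disagreements)        # the concurrent deviations
    sign = 1.0 if agreements >= disagreements else -1.0
    return sign * sqrt((2.0 * c - n) / n)


# Hypothetical signs: 33 disagreements, 15 agreements, 1 neutral year.
series_a = [1] * 33 + [1] * 15 + [0]
series_b = [-1] * 33 + [1] * 15 + [1]
print(round(concurrent_deviation_coefficient(series_a, series_b), 2))
```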

The Pearson Method. We come now to the so-called 
Pearson “r”, the most satisfactory method to apply 
to straight line correlations: that is, to those which 
when graphed show an approximation to a straight 
line rather than a curve. This statement does not refer 
to the trends of the two series taken separately, but 
only to the trend formed by the two sets of cycles 
plotted as x and y, respectively. In explaining the 
method, it is most convenient to begin with two sets 
of cycles, or deviations, already reduced to units con- 
sisting of their respective standard deviations. The 
following table gives two such series, and the process 
of finding their correlation. The two cycles are 
graphed as coordinates in Figure 14, page 136. 

Since the units used in both cases are standard devia- 
tions, the spread on the two axes, as measured by the 





CORRELATION OF PRICES (X) AND EMPLOYMENT (Y)
JANUARY, 1920, TO JANUARY, 1921
Cycles, in Units of Standard Deviation

              x        y         x²         xy
            .75      .59      .5625      .4425
            .94      .85      .8836      .7990
            .91      .51      .8281      .4641
            .88     1.10      .7744      .9680
            .89      .68      .7921      .6052
            .57      .51      .3249      .2907
            .37      .34      .1369      .1258
            .18      .42      .0324      .0756
           -.14     -.09      .0196      .0126
           -.53     -.34      .2809      .1802
           -.98     -.59      .9604      .5782
          -1.74    -1.27     3.0276     2.2098
          -2.10    -2.71     4.4100     5.6910

Totals        0        0    13.0334    12.4427

\[ r = \frac{12.4427}{13.0334} = .95 \]



cycles squared, must necessarily be equal. If every 
deviation in one series concurs with an equal deviation 
in the other series, the points when plotted will neces- 
sarily fall on a diagonal sloping upward from left to 
right at 45°. If positive deviations concur with nega- 
tive, the points will lie in a diagonal sloping downward 
from left to right at 45°. In the first case a line of least 
squares drawn through the points will necessarily have 
a slope of + 1, and in the second case of —1. These 
are obviously the largest results, both positive, and 
negative, that could be obtained from two such correl- 
ative series. A neutral result of zero would be ob- 
tained if the points as plotted fall in haphazard posi- 
tions about the two axes. The slope of the line of 
least squares (the tangent of its angle with the X-axis) 





is therefore taken as the measure of correlation. Its
basic formula is:

\[ r = \frac{\Sigma xy}{\Sigma x^2} \]
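
As a check on the arithmetic, the basic formula may be applied to the price and employment cycles tabulated above. The Python sketch below simply copies the thirteen pairs of items from that table; the variable names are arbitrary. The ratio of the two totals comes to about .955.

```python
# Cycles of prices (x) and employment (y), January, 1920, to January, 1921,
# in units of the standard deviation of each series (copied from the table).
x = [.75, .94, .91, .88, .89, .57, .37, .18, -.14, -.53, -.98, -1.74, -2.10]
y = [.59, .85, .51, 1.10, .68, .51, .34, .42, -.09, -.34, -.59, -1.27, -2.71]

sum_xy = sum(a * b for a, b in zip(x, y))   # summation of the products
sum_x2 = sum(a * a for a in x)              # summation of the squares of x

# Slope of the line of least squares, taken as the measure of correlation.
r = sum_xy / sum_x2

print(round(sum_xy, 4), round(sum_x2, 4), round(r, 3))
```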


In ordinary work it is, of course, necessary first to 
find a trend for each series, if the cycles are to be meas- 
ured. If the deviations are taken from the average of 
each series, the general direction and form of the two 
lines will be contrasted. This is equivalent to assum- 



Figure 14. Correlation of price cycles (x) and employment cycles (y),
expressed in units of the standard deviation of each series, respectively.
Line of least squares (solid line), and line of unit slope (broken line).







ing a horizontal trend as a basis of measurement. In 
certain cases this comparison may be desired. But 
usually it is advisable to measure either the two trends 
by eliminating the cycles or to measure the cycles as 
taken from the trend. The latter is the usual pro- 
cedure, since interest generally lies in a comparison 
of the cycles. 

The computing of a coefficient of correlation from 
data which require the finding of trends is illustrated 
in Table XVIII. The work is for the most part self- 
explanatory, since the trends are found by processes 
already explained. Instead, however, of using the 
standard deviations as the units in which to express 
the cycles, the original units are employed throughout. 
The reduction to standard deviation units is in effect
obtained by inserting σ₁ and σ₂ in the denominator of
the formula for r. But a substitute must be found for
Σx², which in the formula as just used was also in terms
of the standard deviation. The required substitute is
N. This may be seen by recalling that in the standard
deviation series √(Σx²/N) = σ = 1. Hence Σx² must
equal N. The formula for the line of least squares as
applied to the previous problem in correlation may
therefore be transformed into the Pearson correlation
formula, thus:

\[ r = \frac{\Sigma xy}{N} \]
(both x and y being expressed in units of their respective
standard deviations.)

\[ r = \frac{\Sigma xy}{N\sigma_1\sigma_2} \]
(in which x and y are expressed in terms of the original units.)

\[ r = \frac{\Sigma xy}{\sqrt{\Sigma x^2\,\Sigma y^2}} \]
(an alternate form obtained by substitution.)


TABLE XVIII

CORRELATION OF ANNUAL PRODUCTION OF PIG IRON AND AVERAGE PRICE, U. S., 1903-1913
(Cycles measured from Line of Least Squares)

[The body of this table is not legible in the present copy. Its columns
give, for each year from 1903 to 1913, the production of pig iron
(millions of tons) and the price per ton, the line of least squares
fitted to each series, and the resulting cycles. The legible entries
include a column, apparently the price trend, running 16.8, 16.6, 16.4,
16.3, 16.1, 16.0, 15.8, 15.6, 15.5, 15.3, 15.2, and a column running
-90.0, -66.0, -69.0, -50.6, -25.8, 0, 25.8, 54.6, 71.1, 118.8, 155.0.]



The Probable Error. It will be seen that the first 
part of Table XVIII records merely the finding of the 
trends (lines of least squares) for the two series. The 
figures have not been carried out beyond approximate- 
ly one per cent accuracy. The production and price 
cycles shown in the latter part of the table are obtained 
by the usual method of subtracting the trend items



Figure 15. Variations in production and price of pig iron, United 
States, 1908. Standard deviation units, measured from the line of 
least squares (r = .68). 


from the corresponding items in the original series. 
The standard deviation of each of the resulting series 
of cycles and the summation of the xy’s are next found. 
The Pearson formula is then applied, and the result, 
r = .45, is obtained. To this result is appended what 
is known as the “probable error” (P.E.), the word
“error” being here used merely in the sense of di- 
vergence, as in the theory of least squares. The for- 
mula for the probable error expresses the quartile 




deviation from the coefficient of .45 which would normally
appear by the operation of the laws of chance.
The probable error therefore gives some idea of the 
range over which the value of r has an even chance 
of deviating, and may be used in estimating the sig- 
nificance of the correlation. On the basis of experience 
it is assumed that if the value of r is as low as thirty, 
or if the probable error is as high as one-third of r, 



correlation is barely indicated. If, however, r is as 
high as fifty, and if the probable error is not more 
than one-fifth of r, correlation is clearly indicated. 
Between these limits a correlation may be regarded 
as more or less tentatively indicated. 
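
The rule of thumb just stated can be sketched in Python. The sketch assumes the formula commonly given for the probable error of r, P.E. = .6745(1 − r²)/√N, which is not printed in the passage above; the function names are hypothetical. For the pig iron problem, N is taken as 11, the number of years from 1903 to 1913.

```python
from math import sqrt


def probable_error(r, n):
    """Probable error of a coefficient of correlation r based on n pairs
    (the commonly used formula, assumed here)."""
    return 0.6745 * (1.0 - r * r) / sqrt(n)


def judge_correlation(r, n):
    """Apply the limits stated in the text to r and its probable error."""
    pe = probable_error(r, n)
    size = abs(r)
    if size <= 0.30 or pe >= size / 3.0:
        return "barely indicated"
    if size >= 0.50 and pe <= size / 5.0:
        return "clearly indicated"
    return "tentatively indicated"


# The pig iron result of the text: r = .45 from eleven annual items.
print(round(probable_error(0.45, 11), 2), judge_correlation(0.45, 11))
```

On these assumptions the probable error of the pig iron coefficient is about .16, somewhat more than one-third of r, so that by the stated limits the correlation is no more than barely indicated.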

The Relation of Output to Price. The result 
obtained in Table XVIII indicates that in the iron in- 
dustry production is directly adjusted to meet demand 
as reflected in the price. When the price is high, pro- 





duction therefore is high, and vice versa. This relation 
is seen to be very close in certain years if monthly 
data are used (cf. Figure 15). The case is doubtless 
typical of a large part of manufacturing, particularly 
when the process is relatively short. But in agricul- 
ture, where the maturing of the output is determined 
by the seasons, contemporaneous movements of out- 
put and prices are negatively related. This fact is 
shown indirectly by Figure 16. About 75% of the
world’s wheat crop is harvested in the months of June, 
July, and August, with a consequent rapid depression 



Figure 17. Comparison of corrected figures for index of physical 
production of crops and index of crop prices. The long-time movements,
or secular trends, have been eliminated, and the two series have been 
expressed in comparable units. 

Reproduced from the Review of Economic Statistics, by permission 
of the editors. 


of the price. Extremely large crops are normally fol- 
lowed for several months by unusually low prices, and 
vice versa. This fact is depicted in Figure 17, where 
agricultural production is plotted against the subsequent
price, rather than the average price for the
year. A comparison of the cycles of crops and of 
manufactures suggests that the former exert some pos- 




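TABLE XIX

CORRELATION OF ENTRANCE GRADES AND SCHOLARSHIP GRADES

[The body of this frequency table is not legible in the present copy.
Entrance examination grades: weighted averages.]

\[ r = \frac{\Sigma xy - N K_1 K_2}{N\sigma_1\sigma_2}
     = \left(\frac{\Sigma xy}{N} - K_1 K_2\right)\frac{1}{\sigma_1\sigma_2}
     = [2.846 - (-.027)]\,\frac{1}{1.39 \times 3.34} = .62 \]

(P.E. [illegible])
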









itive influence upon the latter within a period of a year 
or two, but the relation is not very regular. 1 

Correlation from Frequency Tables. A somewhat 
difficult application of the Pearson method of meas- 
uring correlation is encountered when the two series 
which are to be compared are each compiled in fre- 
quency tables. The case is illustrated in Table XIX, 
where average entrance examination grades and aver- 
age scholarship grades for the four college years are 
compared. The entrance examination data are ex- 
pressed in per cents, tabulated to the nearest multiple 
of five. The scholarship grades were expressed pri- 
marily in six groups, ranking downward in order from 
the first to the sixth. The averaging of such groups 
for four years gave results carried out to fourths of a 
group, as shown in the table. For convenience of cal- 
culation the values of both scales are converted into 
unit intervals measuring the deviations, the new scales 
centering at a zero set opposite the values (70 and 
Zy±) which are assumed as the averages. The number 
of frequencies for the combined series is written at 
the appropriate coordinate points in the body of the 
table, while the frequencies for each scale taken inde- 
pendently are written to the right and below (column 
F and row F). 
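The conversion of a scale into unit intervals can be sketched in a few lines. This is only an illustration for the entrance grades, using the assumed average of 70 and the five per cent class interval stated above; the function name and the sample grades are hypothetical.

```python
def unit_deviation(grade, assumed_average=70, interval=5):
    """Deviation of a tabulated grade from the assumed average,
    measured in class intervals rather than in per cents."""
    return (grade - assumed_average) / interval

# Hypothetical grades, already tabulated to the nearest multiple of five.
sample_grades = [55, 60, 70, 85]
sample_deviations = [unit_deviation(g) for g in sample_grades]
print(sample_deviations)
```

A grade of 70 falls at zero on the new scale, and each step of five per cent above or below it counts as one unit.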

The initial steps in the computation will be readily 
understood by reference to the short-cut method of 
finding the standard deviation. By this method the 
standard deviations for both series, respectively, are 
determined. The finding of $\Sigma xy$ is a somewhat long

1 H. L. Moore, in Economic Cycles, finds a positive correlation be-
tween an eight-year crop cycle and general prices, allowing a four-
year lag to the latter.





process, since each of the frequencies at the coordinate 
points in the body of the table must be taken into ac- 
count. Each of these frequencies is multiplied by its 
two coordinate values, and the products are totaled. 
To illustrate, the first three columns of frequencies 
give the following results: 

 F     x     y    Fxy

 1    -6     0      0
 1    -6    -2     12
 2    -5    -1     10
 1    -4     1     -4
 2    -4    -1      8


By continuing this computation, a total result of 148,
as the value of $\Sigma xy$, is obtained.
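The arithmetic of the five products quoted above may be checked with a short sketch. The triples are the frequencies and coordinate values as read from the text; the variable names are illustrative only.

```python
# Each tuple is (F, x, y): a frequency and its two coordinate values.
cells = [(1, -6, 0), (1, -6, -2), (2, -5, -1), (1, -4, 1), (2, -4, -1)]

# Multiply each frequency by both of its coordinate values.
products = [f * x * y for f, x, y in cells]
partial_sum = sum(products)

print(products)
print(partial_sum)
```

These five cells contribute only a part of the total; the remaining cells of the table are treated in the same way.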

Since the two inserted scales measuring the devia- 
tions are not centered with precision at the two axes 
of the table, as determined by the averages of the two 
series, respectively, the plus and minus deviations from 
the assumed averages will not exactly balance. Hence 
a correction must be made in the $\Sigma xy$, just as in the
two standard deviations. The corrections ($K_1$ and
$K_2$) applied to the finding of the standard deviations
are, of course, merely $\Sigma FD \div N$. It may be shown
that $\Sigma xy$ will be increased by the product of the two
corrections, for every item included. The corrected
summation of the moments about the coordinate axes
is therefore expressed:

\[ \Sigma xy - N K_1 K_2 \]

In other respects the formula is as previously used. 
For convenience, however, it is written in the revised 
form shown at the foot of the table. 1 Applying the 

1 In correlations where the deviations to be contrasted are intended
to be taken from the average of each series, the coefficient may be 




formula to the data in question, the value of r is found
to be .62 ± .06, a well-marked correlation.
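The whole computation from a frequency table can be sketched as follows. This is a minimal illustration, not the author's worksheet: it assumes the short-cut standard deviation takes the form $\sqrt{\Sigma FD^2 / N - K^2}$, the function names are hypothetical, and the small table is invented for the check. The last line repeats the arithmetic printed at the foot of Table XIX.

```python
from math import sqrt

def r_from_frequency_table(cells):
    """Pearson r from (F, x, y) cells, where x and y are unit
    deviations from the assumed averages of the two scales."""
    cells = list(cells)
    n = sum(f for f, _, _ in cells)
    k1 = sum(f * x for f, x, _ in cells) / n   # correction K1
    k2 = sum(f * y for f, _, y in cells) / n   # correction K2
    sigma1 = sqrt(sum(f * x * x for f, x, _ in cells) / n - k1 ** 2)
    sigma2 = sqrt(sum(f * y * y for f, _, y in cells) / n - k2 ** 2)
    sum_xy = sum(f * x * y for f, x, y in cells)
    return (sum_xy / n - k1 * k2) / (sigma1 * sigma2)

def r_direct(cells):
    """The same coefficient, taking deviations from the true averages."""
    xs = [x for f, x, _ in cells for _ in range(f)]
    ys = [y for f, _, y in cells for _ in range(f)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented table, used only to show that the two routes agree.
demo_cells = [(1, -2, -1), (2, -1, -1), (3, 0, 0), (2, 1, 0), (2, 1, 2), (1, 3, 1)]
print(r_from_frequency_table(demo_cells), r_direct(demo_cells))

# Figures from the foot of Table XIX.
r_table_xix = (2.846 - (-0.027)) / (1.39 * 3.34)
print(round(r_table_xix, 2))
```

The agreement of the two functions shows that the corrections $K_1$ and $K_2$ fully offset the error of centering the scales at assumed rather than true averages.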

TABLE XX

CORRELATION OF RANKING OF STATES IN MANUFACTURING
AND IN LITERACY, U. S., 1860

Columns: State; Rank in manufactures (capital per square mile); Rank in literacy of native whites; D; D²

[Body of the table not reproduced. The twenty-nine states ranked are Alabama, Arkansas, Connecticut, Delaware, Florida, Georgia, Illinois, Indiana, Iowa, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Mississippi, Missouri, New Hampshire, New Jersey, New York, North Carolina, Ohio, Pennsylvania, Rhode Island, South Carolina, Tennessee, Vermont, Virginia, and Wisconsin. The total of the D² column is 1618.]

\[ r = 1 - \frac{6 \Sigma D^2}{N(N^2 - 1)} = 1 - \frac{6 \times 1618}{29 \times 840} = .60 \]

P.E. = .10


found directly from the original items. This is done by the use of 
the formula given in Table XIX. An average of zero is assumed for 
both series, and the items are treated as positive deviations from this 
average. The standard deviations are found by the modified formula 
explained on page 41. 





The Method of Rank-differences. One further modi- 
fication of the Pearson method of correlation, known 
as the method of rank-differences, may be noted. This 
method has the advantage of simplicity, and is espe- 
cially applicable to comparisons which are made on the 
basis of approximate measurements only. In Table 
XX this method is illustrated by applying it to a com- 
parison of the ranking of twenty-nine states in 1860 
for manufacturing and literacy. The rankings as here 
shown are based upon the census of 1860. In arrang- 
ing such rankings, ties may sometimes occur. In such 
a case the average rank of the tied items is applied to 
each of the items. Thus if the second and third items 
happen to be equal, each is ranked 2½; if the second,
third, and fourth are equal, each is ranked 3. When 
the rankings have been tabulated, as shown, the dif- 
ference between the two ranks for each state is found. 
These differences are then squared, and the squares 
totaled. The formula, as given at the foot of the table, 
is an adaptation from the one last discussed. Apply- 
ing the formula, we find that a correlation of .60 
exists between the two series. This comparison is 
an illustration of a number of interesting relationships 
which may be shown to exist between the economic 
and the social environment. 
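The steps just described can be sketched in code. This is an illustration only: the function names are hypothetical, and ranking from the largest value downward is an assumption made for the example. The final line repeats the arithmetic at the foot of Table XX, where the squared differences total 1618 for the twenty-nine states.

```python
def average_ranks(values):
    """Rank items from the largest (rank 1) downward; tied items
    each receive the average of the ranks they jointly occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next item ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks

def r_from_sum_d2(sum_d2, n):
    """Rank-difference coefficient from the total of squared differences."""
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

def rank_difference_r(ranks_a, ranks_b):
    """Square the difference of the two ranks for each item and total."""
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return r_from_sum_d2(sum_d2, len(ranks_a))

print(average_ranks([50, 40, 40, 30]))      # second and third items tied
print(average_ranks([9, 7, 7, 7, 1]))       # second, third, and fourth tied
print(round(r_from_sum_d2(1618, 29), 2))    # figures of Table XX
```

Identical rankings give a coefficient of 1, and exactly reversed rankings give -1, which is a convenient check on any implementation.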

Conclusion. The purpose of this chapter will have 
been served if the student has gained a knowledge of 
the simpler methods commonly employed in measuring 
correlation. The full theory of the subject is very
complex, and is hardly within the scope of an introduc- 
tory course. A caution must be sounded, however, 
against an undiscriminating application of the meth- 





ods here explained. In particular, conclusions stating 
causal relationships should never be based on mathe- 
matical processes alone. The data, their methods of 
collection, and the concrete realities they are assumed 
to measure, must all be subjected to careful scrutiny. 
The same caution may indeed very properly be ex- 
tended to the whole field of statistical methods. These 
methods should prove to be valuable tools in the inter- 
pretation of physical, biological, and social phenomena, 
but they may be a source of positive error if their use 
is not directed by an adequate comprehension of the 
field of knowledge in which they are employed. 

REFERENCES 

Bowley, Arthur L., Elements of Statistics (4th Edition), Part 
II, Chapters VI-IX. 

Jevons, W. Stanley, The Principles of Science. 

King, W. I., “The Correlation of Historic Economic Vari- 
ables,” Quarterly Publications of the American Statistical 
Association, December, 1917, pp. 847-853. 

Persons, W. W., “The Correlation of Economic Statistics,” 
Quarterly Publications of the American Statistical Asso- 
ciation, December, 1910, pp. 287-322. 

Secrist, Horace, Readings and Problems in Statistical Meth- 
ods, Chapter X. 

West, Carl S., Introduction to Mathematical Statistics. 

Yule, G. U., An Introduction to the Theory of Statistics, 
Chapters IX-XII. 


EXERCISES 

1. On separate sheets of cross-section paper having the same 
horizontal scale, and with the vertical scales so ad-
justed as to bring the deviations as nearly as possible
to the same measured average, plot the cycles of pro- 
duction and price as obtained in exercises 1 and 2 of 
the preceding chapter. Similarly plot the cycles in the 
interest rate, measuring them from Figure 12, and the 





price cycles as shown in Figure 6 (pp. 118 and 101). 
Copy these cycles on tracing paper. Describe the correla- 
tion of production and price of the two crops (allowing a 
lag of one year for prices), and of the interest and price
cycles. 

2. By the method of concurrent deviations, measure the
following correlations (Tables X, XI, and XV, pp. 51, 53,
and 81): (a) Wholesale prices and per capita produc-
tion (both concurrently, and allowing a lag of one year
for prices), and (b) Wholesale prices and real wages.

3. Using the table given in exercise 8 of the preceding chap- 
ter, and the lines of least squares there obtained, measure 
by the Pearson “r” the correlation of production and 
price of iron for each year there studied. 

4. From the data on page 53, find the correlation between 
wages and the cost of living for the years 1913-1920
inclusive (Pearson “r”). Measure the deviations from 
the average of each series, respectively; that is, assume 
a horizontal trend. 

5. Reduce the deviations obtained in the preceding exer- 
cise to units of the standard deviation of each series, 
respectively, and plot as coordinates the two sets of 
deviations thus obtained. Compute the line of least 
squares for the points so plotted, and show that the 
slope of this line is identical with the coefficient of cor- 
relation. 

6. Correlate the following indexes (Pearson “r”) taking 
the deviations from the average, without finding a trend. 
Explain the significance of the result. 


Year     Prices     Unemployment

1912      110            70
1913      100           120
1914       90           140
1915       90           100
1916      110            70


7. Find the Pearson coefficient of correlation for the in- 
dexes of normal seasonal variations of merchandise ex- 
ports from the United States and the price of sterling 
exchange at New York, as follows:





Month Exports Sterling 

January 110 100 

February 95 106 

March 99 109 

April 90 115 

May 87 116 

June 80 120 

July 78 119 

August 85 106 

September 98 74 

October 125 70 

November 123 80 

December 130 83 


8. (a) Find the Pearson coefficient of correlation measur- 

ing the relationship between the following indexes of 
seasonal variation in the visible supply and the price of 
wheat in the United States, based on the years 1909-1913. 
(b) Find the coefficient, as before, but assuming that 
prices tend to anticipate the supply by about a month. 



Month         Visible supply     Price (first of mo.)

January            139                  97
February           130                 100
March              122                 101
April              112                 101
May                 89                 103
June                69                 106
July                52                 104
August              60                 100
September           77                  97
October             99                  97
November           118                  98
December           133                  96


9. The following correlation table presents entrance groups 
(vertical scale) and scholarship groups (horizontal 
scale) for a certain class of students. Find the coef- 
ficient of correlation (Pearson “r”) and the probable 
error. 



[Correlation table for exercise 9 not reproduced. The vertical scale (y) runs from 10 down to 1; the cell frequencies cannot be recovered from this copy.]



10. The following correlation table classifies to the nearest
twenty-five per cent sixty-nine important commodities
according to their price indexes (base, 1913) in May,
1920, and May, 1921 (Monthly Labor Review, Aug.,
1921, pp. 84-85). The correlation measures approxi-
mately the evenness of the price changes occurring
between the two dates. Find Pearson’s “r.”


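Price Indexes, May, 1920

[Body of the correlation table not reproduced. The scale printed with the table runs from 400 down to 75 in steps of 25, and the marginal totals printed after it read 1, 0, 0, 0, 0, 0, 2, 7, 6, 9, 11, 12, 15, 6. The other set of marginal totals reads 1, 2, 4, 8, 8, 5, 9, 4, 4, 4, 5, 4, 5, 1, 1, 1, 1, 1, 1. Total, 69.]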


11. The following table shows the ranking of states in 
(a) Noted men born in state, per 1000 population in
1880, (b) Population per square mile in 1890, and 
(c) Per cent of urban population in 1890. By the 
method of rank-differences measure the correlations 
existing among these three series. 


State              (a)     (b)     (c)

Alabama             23      24     25.5
Arkansas            29      28     28
Connecticut          4       4      3
Delaware             8       9     11
Florida             28      29     20
Georgia             26      22     23
Illinois            16      10     10
Indiana             15      11     17
Iowa                19      20     19
Kentucky            18      12     21
Louisiana           25      26     18
Maine                5      27      9
Maryland            10       7      8
Massachusetts        2       2      2
Michigan            17      18.5   14
Mississippi         27      25     29
Missouri            21      16     16
New Hampshire        3      14      6
New Jersey          11       3      5
New York             7       5      4
North Carolina      22      21     27
Ohio                 9       8     12
Pennsylvania        12       6      7
Rhode Island         6       1      1
South Carolina      20      17     25.5
Tennessee           24      13     24
Vermont              1      18.5   13
Virginia            13      15     22
Wisconsin           14      23     15



APPENDIX I 


LABORATORY MATERIAL AND REFERENCES 

Equipment for Graphic Work. While the larger part of 
statistical work may be done without any systematic train- 
ing in mechanical drawing, yet some degree of skill in this 
field is necessary if graphic representations are to be satis- 
factorily prepared. The necessary degree of skill may readily 
be acquired. The student should provide himself with a 
drawing board, celluloid triangles and irregular curves, a 
ruler with decimal subdivisions of the inch, a ruling pen, 
some round-pointed and fine pens for lettering, India ink, 
several styles of cross-section paper, and a loose-leaf note- 
book. If he is unfamiliar with the use of drafting materials, 
he should read the introductory directions given in an ele- 
mentary treatise on mechanical drawing. 

In addition to the material just mentioned, some of the 
more complicated apparatus used in an engineer’s drafting 
room will be found useful. This equipment may include a 
drafting machine, a pantagraph, a line-spacer, a map-meas-
urer, and a planimeter. A blue-print outfit is also very use-
ful, indeed is almost a necessity, unless some improved copy-
ing device like the photostat is available for use. For ele-
mentary work a simple 8" x 10" photography printing frame,
and corresponding blue-print paper, will be found quite sat-
isfactory and inexpensive.

Lettering. It is not difficult to learn to draw freehand the 
italic letters used by draftsmen. Directions for such work will 
be found in “Lettering for Draftsmen, Engineers, and Stu- 
dents,” by Charles W. Reinhardt (D. Van Nostrand Com- 
pany, New York), or in other books on the same subject. 

Other Material. There are several aids to statistical work 
that will materially lighten the drudgery incidental to long
mathematical processes. The most common of these is the 
slide rule. A ten-inch rule, giving squares and cubes, will 
be found sufficient for the greater part of the work involving 
multiplication, division, and powers or roots. The slide rule 



is not difficult to use, and should be mastered by every stu- 
dent of statistics. Besides being an inexpensive and portable 
device for mathematical operations, it will be found useful 
in the drawing of logarithmic or ratio graphs, which are now 
coming into general use. If, however, it is necessary to ob- 
tain products or quotients accurate to four or five significant 
figures, a large cylindrical slide rule may be used, such as the 
Thatcher, though this is less convenient and considerably 
more expensive. For powers, roots, and reciprocals, elaborate 
printed tables are obtainable. If possible, an adding machine 
(listing) should be available for occasional use, such as the 
Dalton, the Burroughs, or the Federal. While such a machine 
is a convenience, a calculating machine is an absolute neces- 
sity if very extensive work is to be attempted ; and it is well 
for the student to become acquainted with its operation. Sev- 
eral successful models are now on the market, among which 
may be mentioned the Burroughs, the Comptometer, the Mon- 
roe, and the Marchant. The last two are dial machines, par- 
ticularly adapted for subtractions and divisions. For certain 
kinds of statistical work tabulating machines (Hollerith and 
Powers types) are required, but these machines are so com- 
plex and expensive that they can hardly be made available 
except in the larger laboratories. 

Recording. Laboratory exercises and other statistical work 
should be recorded fully, and should be put in clear and neat 
form. Every graph should be accurately labeled, and the 
units used in each scale should be indicated. When the scales 
of a graph do not begin at the zero point, the initial coor- 
dinates should not be drawn more heavily than the others, 
since they are likely to be looked upon as base lines if so 
drawn. In so far as is practicable, the tables of data from 
which a graph is drawn should accompany the figure, and 
the source should be noted. Graphing and lettering should 
be done in pencil first; the pencil draft may then be com- 
pleted with India ink, and the pencil lines erased with a soft 
rubber. Errors in calculation should not be tolerated. All 
mathematical operations should be performed twice, or some 
other reliable method of checking should be adopted. Data 
copied from an original source should always be carefully 
verified. Tables and mathematical processes may most con-
veniently be recorded on cross-section paper having one-fifth
or one-sixth inch spacing. In any given study the conclu-
sions should be brought out clearly, and their significance ex-
plained.

Various Types of Graphs. Most of the types of graphs in 





common use have been illustrated in the preceding pages. 
In addition, mention may be made of certain elementary 
types. One of these is the bar diagram, in which bars of 
uniform width, and proportional in length to given magni- 
tudes, are used. They are drawn horizontally, except in time 
series. When they are subdivided, the parts are distinguished 
by various kinds of cross-hatching and shading. Another is 
the “pie diagram,” in which a circle is subdivided by radii. 
This diagram is particularly adapted to the representation of 
percentage subdivisions, such as the relative expenditures for 
certain classes of goods in a family budget. Another type is 
the polar chart, designed for graphing seasonal data. Draw-
ings of similar surfaces and solids are sometimes used in
the representation of given magnitudes. It should be re- 
membered that, geometrically, magnitudes compared by the 
use of similar surfaces vary as the square of the dimensions; 
and by the use of similar solids, as the cube of the dimensions. 
Sometimes in such drawings it is explicitly stated that the 
ratio is represented by one dimension only, as when the 
military forces of different countries are set forth by means 
of drawings of soldiers whose heights are proportional to the 
size of the armies. Such drawings are not scientific, how- 
ever, and are justified only in material of a very popular 
nature. In general, the use of similar surfaces and solids 
in the representation of magnitudes is to be avoided. A more 
complex type of graph is the statistical map. This may be 
drawn in so many different ways that a general description is 
impossible. The student having occasion to use it should con- 
sult the excellent examples contained in the Statistical Atlas 
of the United States. 
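The geometrical point about similar surfaces and solids can be illustrated with a brief sketch. The function names are hypothetical and the figures are invented; the sketch only restates the rule that areas vary as the square, and volumes as the cube, of the linear dimensions.

```python
from math import sqrt

def linear_scale_for_surface(ratio):
    """Factor by which to enlarge the dimensions of a surface so that
    its area, not its height, stands in the given ratio."""
    return sqrt(ratio)

def linear_scale_for_solid(ratio):
    """Corresponding factor for a solid, whose volume varies as the cube."""
    return ratio ** (1 / 3)

def apparent_ratio(linear_scale, dimensions):
    """Ratio the eye receives when every dimension is enlarged alike."""
    return linear_scale ** dimensions

# A magnitude twice as great, drawn as a figure twice as tall:
print(apparent_ratio(2, 2))   # surface appears four times as large
print(apparent_ratio(2, 3))   # solid appears eight times as large
print(linear_scale_for_surface(4))
```

A soldier drawn twice as tall to represent an army twice as large therefore suggests an army four or eight times as large, which is the source of the objection raised above.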

Sources, References, and Tables. Brief summaries are
given below of the principal sources of statistical material, 
and the textbook references and statistical tables which are 
most likely to be of use in connection with an introductory 
course. 


SOURCES OF STATISTICAL DATA 

Aldrich Report (Senate Report No. 1394)
Annalist
Bradstreet’s
Commercial and Financial Chronicle
Dun’s Review
Federal Reserve Bulletin
Financial Review (year book)
Monthly Labor Review
Monthly Review (Federal Reserve Bank of New York)
Monthly Summary of Foreign Commerce of the United States
Review of Economic Statistics (Harvard)
Statesman’s Yearbook
Statistical Abstract of the United States (yearbook)
Statistical services: Babson’s, Banker’s Statistical Corpora-
tion, Brookmire’s, and Prentice-Hall
Survey of Current Business
United States Census
Weather, Crops, and Markets (U. S. Dept. of Agriculture)
World Almanac
Yearbook of the Department of Agriculture

TEXTBOOKS, TABLES, AND GENERAL
REFERENCES

American Economic Review (Bi-monthly)
Bailey and Cummings, Statistics
Barker, E. H., Computing Tables and Formulas
Barlow’s Tables
Bowley, A. L., Elements of Statistics
Brinton, W. C., Graphic Methods for Presenting Facts
Copeland, M. T., Business Statistics
Davenport, C. B., Statistical Methods
Jordan, D. F., Business Forecasting
Journal of Political Economy (Monthly)
King, W. I., Elements of Statistical Method
Marshall, Wm. C., Graphical Methods
Quarterly Journal of Economics
Quarterly Publications of the American Statistical Associa-
tion
Secrist, H., An Introduction to Statistical Methods
Secrist, H., Readings and Problems in Statistical Methods
West, C. S., Introduction to Mathematical Statistics
Whipple, G. C., Vital Statistics



APPENDIX II 



TABLE OF POWERS AND ROOTS

                               Square     Cube
 No.    Square        Cube      Root      Root

   1         1           1     1.000     1.000
   2         4           8     1.414     1.259
   3         9          27     1.732     1.442
   4        16          64     2.000     1.587
   5        25         125     2.236     1.709
   6        36         216     2.449     1.817
   7        49         343     2.645     1.912
   8        64         512     2.828     2.000
   9        81         729     3.000     2.080
  10       100       1,000     3.162     2.154
  11       121       1,331     3.316     2.223
  12       144       1,728     3.464     2.289
  13       169       2,197     3.605     2.351
  14       196       2,744     3.741     2.410
  15       225       3,375     3.872     2.466
  16       256       4,096     4.000     2.519
  17       289       4,913     4.123     2.571
  18       324       5,832     4.242     2.620
  19       361       6,859     4.358     2.668
  20       400       8,000     4.472     2.714
  21       441       9,261     4.582     2.758
  22       484      10,648     4.690     2.802
  23       529      12,167     4.795     2.843
  24       576      13,824     4.898     2.884
  25       625      15,625     5.000     2.924
  26       676      17,576     5.099     2.962
  27       729      19,683     5.196     3.000
  28       784      21,952     5.291     3.036
  29       841      24,389     5.385     3.072
  30       900      27,000     5.477     3.107
  31       961      29,791     5.567     3.141
  32     1,024      32,768     5.656     3.174
  33     1,089      35,937     5.744     3.207
  34     1,156      39,304     5.830     3.239
  35     1,225      42,875     5.916     3.271
  36     1,296      46,656     6.000     3.301
  37     1,369      50,653     6.082     3.332
  38     1,444      54,872     6.164     3.361
  39     1,521      59,319     6.244     3.391
  40     1,600      64,000     6.324     3.419
  41     1,681      68,921     6.403     3.448
  42     1,764      74,088     6.480     3.476
  43     1,849      79,507     6.557     3.503
  44     1,936      85,184     6.633     3.530
  45     2,025      91,125     6.708     3.556
  46     2,116      97,336     6.782     3.583
  47     2,209     103,823     6.855     3.608
  48     2,304     110,592     6.928     3.634
  49     2,401     117,649     7.000     3.659
  50     2,500     125,000     7.071     3.684
  51     2,601     132,651     7.141     3.708
  52     2,704     140,608     7.211     3.732
  53     2,809     148,877     7.280     3.756
  54     2,916     157,464     7.348     3.779
  55     3,025     166,375     7.416     3.802
  56     3,136     175,616     7.483     3.825
  57     3,249     185,193     7.549     3.848
  58     3,364     195,112     7.615     3.870
  59     3,481     205,379     7.681     3.892
  60     3,600     216,000     7.745     3.914
  61     3,721     226,981     7.810     3.936
  62     3,844     238,328     7.874     3.957
  63     3,969     250,047     7.937     3.979
  64     4,096     262,144     8.000     4.000
  65     4,225     274,625     8.062     4.020
  66     4,356     287,496     8.124     4.041
  67     4,489     300,763     8.185     4.061
  68     4,624     314,432     8.246     4.081
  69     4,761     328,509     8.306     4.101
  70     4,900     343,000     8.366     4.121
  71     5,041     357,911     8.426     4.140
  72     5,184     373,248     8.485     4.160
  73     5,329     389,017     8.544     4.179
  74     5,476     405,224     8.602     4.198
  75     5,625     421,875     8.660     4.217
  76     5,776     438,976     8.717     4.235
  77     5,929     456,533     8.774     4.254
  78     6,084     474,552     8.831     4.272
  79     6,241     493,039     8.888     4.290
  80     6,400     512,000     8.944     4.308
  81     6,561     531,441     9.000     4.326
  82     6,724     551,368     9.055     4.344
  83     6,889     571,787     9.110     4.362
  84     7,056     592,704     9.165     4.379
  85     7,225     614,125     9.219     4.396
  86     7,396     636,056     9.273     4.414
  87     7,569     658,503     9.327     4.431
  88     7,744     681,472     9.380     4.447
  89     7,921     704,969     9.433     4.464
  90     8,100     729,000     9.486     4.481
  91     8,281     753,571     9.539     4.497
  92     8,464     778,688     9.591     4.514
  93     8,649     804,357     9.643     4.530
  94     8,836     830,584     9.695     4.546
  95     9,025     857,375     9.746     4.562
  96     9,216     884,736     9.797     4.578
  97     9,409     912,673     9.848     4.594
  98     9,604     941,192     9.899     4.610
  99     9,801     970,299     9.949     4.626
 100    10,000   1,000,000    10.000     4.641

Note: In the above table the last two columns are correct to three 
decimal places, without allowance for decimals dropped. 
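The note means that the roots are cut off after the third decimal rather than rounded. A minimal sketch of that convention follows; it works in whole numbers so that no rounding error can creep in, and the function names are hypothetical.

```python
def integer_root(m, degree):
    """Largest whole number whose given power does not exceed m."""
    lo, hi = 0, 1
    while hi ** degree <= m:
        hi *= 2
    # Invariant: lo ** degree <= m < hi ** degree
    while lo < hi - 1:
        mid = (lo + hi) // 2
        if mid ** degree <= m:
            lo = mid
        else:
            hi = mid
    return lo

def truncated_root(n, degree, places=3):
    """Root of n with the decimals beyond `places` dropped, not rounded."""
    scaled = integer_root(n * 10 ** (degree * places), degree)
    whole, fraction = divmod(scaled, 10 ** places)
    return f"{whole}.{fraction:0{places}d}"

print(truncated_root(7, 2))    # square root of 7 as the table gives it
print(f"{7 ** 0.5:.3f}")       # the same root rounded, for contrast
print(truncated_root(45, 3))   # cube root of 45
```

The square root of 7 is 2.6457..., so the table shows 2.645 where a rounded table would show 2.646; the last figure of any entry may thus be one unit low.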




INDEX 


A

Aggregate method, 56
Aldrich Report, 4
American Economic Review, 53, 96, 124, 127
American Statistical Association, 96-148
Annalist, The, 97
Annalist Barometer, 125
Arithmetic mean, 20
Average, 20
Average deviation, 36

B

Babson, Roger W., 76, 124, 126
Babson’s index, 65
Bailey, W. B., 19
Bowley, Arthur L., 19, 44, 67, 148
Bradstreet’s index, 64, 124
Bureau of Labor Statistics, 51, 56, 63, 113, 124
Barnett, George E., 66
Brinton, W. C., 44
Business barometers, 115
Business cycles, 123
Business statistics, 3

C

Circulation, 98
Concurrent deviations, 132
Continuous series, 5
Correlation, 131
Cost of living, 53
Corn, 76, 77
Cummings, John, 19
Cycles, 101, 113, 137

D

Day, E. E., 19, 96
Derived table, 13
Discrete series, 5
Dun’s index, 64

E

Exports, 129, 150
F 

Falkner, R. P., 3 
Farm wages, 70 

Federal Reserve Bank of New 
York, 124 

Federal Reserve Board, 64 
Federal Reserve Bulletin, 65 
Field, J. A., 44 
Fisher, Irving, 44, 67, 96 
Fisher’s index, 86 
Food prices, 56 
Foreign exchange, 68 
Free-hand method, 100 
Frequency curve, 9 
Frequency polygon, 16 

G 

Graphic work, 153 
H 

Hansen, A. H., 124 
Howard, Stanley E., 49, 67 
Hoffmann, F. L., 96 
Hurlin, Ralph G., 126 

I 

Income distribution, 92 

Indexes, 47, 62, 65 

Index of physical production, 75 

Index of quantity, 74 

Index of value, 74 

Ingalls, W. R., 96

Interest rates, 118 

Interpolation, 30 

J 

Jevons, Stanley W., 148 
Jordan, D. F., 121, 126, 127



K

Kemmerer, E. W., 88, 123, 127 
King, W. I., 44, 96, 98 
Knauth, Oswald W., 92 
Koren, John, 19 

L 

Lettering, 153 

Line of least squares, 105, 135 
Link-relative method, 120 
Lorenz curve, 40, 44 

M 

Machine tabulation, 18 
Macauley, Frederick R., 92 
Marshall, Wm. C., 44 
Median, the, 27 
Meeker, Royal, 67, 96 
Method of rank-differences, 147 
Method of averages, 118 
Method of semi-averages, 102 
Method of standard quantities, 83 
Mitchell, Wesley C., 67, 92, 127 
Mode, 21 

Monthly Labor Review, 29, 70 
Monthly Review, 118, 123 
Moore, H. L., 126, 127, 144 
Moving average, 102 

N 

National Bureau of Economic Re- 
search, 91, 96 
National income, 91 
National Industrial Conference 
Board, 54 
Normal curve, 10 

O 

Ogive, 34, 36 

P 

Parabola trend, 110 
Pearson method, 134 
Persons, W. W., 121, 127, 148 
Peddle, John B., 127 
Pig iron, 76, 77, 128 
Physical production in U. S., 80, 
81, 98 

Pratt, Andrew A., 127 
Price indexes, theory of, 82 
Prices, 51, 55, 60, 63, 65, 77, 113,
128, 142


Primary table, 5 
Probable error, 30, 140 
Property ownership, 96 
Proportional expenditure method, 
57 

Pareto’s law, 94 

Q 

Quantity indexes, 74 
Quantity theory, 87 
Quartile deviation, 29 
Quartile dispersion, 32 
Quarterly Journal of Economics, 
109, 126 

R

Rank-differences, 147
Ratio paper, 33 
Real wages, 48, 51, 53 
Rectangular histogram, 25 
Reference list, 156 
Review of Economic Statistics, 96, 
121, 123, 124, 126 
Royal Statistical Society, 67 
Rugg, H. O., 19 

S 

Salaries, 16 

Salaries in Universities, 28 
Schedules, 4 
Seasonal index, 120 
Seasonal variations, 116 
Secrist, Horace, 19, 44, 67, 148 
Semi-logarithmic paper, 33 
Skewness, 43 
Slichter, C. S., 105 
Sources of data, 155 
Standard price, 78 
Statistical units, 17 
Stewart, Walter W., 67, 96 

T 

Tabulation methods, 18 
Tally sheet, 11 
Tingley, Richard H., 127 
Trends, nature of, 100
Types of graphs, 154 
Types compared, 27 

V 

Value indexes, 74 

Visible supply of wheat, 141, 150 



W

Wages and salaries, 16 
Wage indexes, 49, 51, 53, 70 
Wage roll, 7 
Walsh, C. M., 97


Weighted average, 20 
Weights, cost of living, 71 
West, Carl S., 148
Wheat, 76, 77, 129, 141, 150 
Whipple, G. C., 44 
Working, Holbrook, 97 










